Developmental gene expression programmes are coordinated by the specialized distal cis-regulatory elements called enhancers, which integrate lineage- and signalling-dependent inputs to guide morphogenesis. In previous work, we characterized the genome-wide repertoire of active enhancers in human neural crest cells (hNCC), an embryonic cell population with critical roles in craniofacial development. We showed that in hNCC, co-occupancy of a master regulator TFAP2A with nuclear receptors NR2F1 and NR2F2 correlates with the presence of permissive enhancer chromatin states. Here, we take advantage of pre-existing human genetic variation to further explore potential cooperation between TFAP2A and NR2F1/F2. We demonstrate that isolated single nucleotide polymorphisms affecting NR2F1/F2-binding sites within hNCC enhancers can alter TFAP2A occupancy and overall chromatin features at the same enhancer allele. We propose that a similar strategy can be used to elucidate other cooperative relationships between transcription factors involved in developmental transitions. Using the neural crest and its major contribution to human craniofacial phenotypes as a paradigm, we discuss how genetic variation might modulate the molecular properties and activity of enhancers, and ultimately impact human phenotypic diversity.
1. Developmental enhancers as platforms for cooperative transcription factor recruitment and integration of signalling cues
In the course of embryonic development multiple cell types, tissues and organs are progressively generated from a single cell, the zygote. These different cell types share nearly identical genetic information, yet each possesses unique properties that are driven by expression of a characteristic set of genes. It is now widely accepted that enhancers, a group of distal cis-regulatory elements, are crucial in the establishment of cell-type- and developmental stage-specific gene expression patterns [1–6]. According to their classical definition, enhancers are compact (approx. 200–500 bp) DNA sequences containing clusters of transcription factor (TF)-binding motifs that control expression of target promoters over long distances and in an orientation-independent manner [2,7]. The spatio-temporal activity of developmental enhancers is largely determined by the combinatorial binding of TFs to their cognate motifs [5,8–10]. The large number of TFs encoded by mammalian genomes, their frequent tissue specificity and their responsiveness to signalling ensures a rich repertoire of context-dependent TF combinations. TFs representing both lineage-specific regulators as well as sequence-dependent effectors of signal transduction pathways commonly converge at enhancer elements to activate transcription, thus integrating intrinsic and extrinsic environmental signalling cues [1,11–13]. Such integration allows for exquisite spatial and temporal control of gene expression during development.
To gain access to DNA, TFs have to compete with nucleosomes that occlude TF-binding motifs and block protein–DNA interactions [14–16]. The DNA-binding affinity of an individual TF is typically much lower than that of a histone octamer–DNA complex, hence cooperative binding of multiple factors is thought to play a major role in overcoming the nucleosomal barrier in TF recruitment [17–19]. Such cooperativity may depend on direct physical association between TFs, or may occur in the absence of direct interaction by simply promoting nucleosomal eviction [5,18]. In some instances recruitment of TFs to enhancer elements is sequential, with the so-called ‘pioneer’ factors able to first gain access to the nucleosomal DNA, either in isolation or through interaction with chromatin remodellers and histone chaperones [20–27]. Such pioneer factors can subsequently facilitate binding of other TFs and the assembly of coactivator complexes [20,22–24]. Coactivator recruitment allows an additional layer of regulatory integration, as TF cooperativity provides multiple binding surfaces for general coactivators or for an increasing repertoire of coactivators with distinct chromatin remodelling and modifying activities [2,28,29]. Cumulatively, this enables unique TF combinations to synergistically establish permissive chromatin states at enhancer elements and promote long-range communication with promoters [2,30,31]. Despite the cell-type specificity of these TF combinations, transcriptionally permissive chromatin states share many conserved features regardless of cellular context. For example, active enhancers display common epigenomic profiles such as nucleosomal depletion at TF-occupied sites flanked by regions enriched for nucleosomes marked by certain histone modifications, including H3K4me1 and H3K27ac (reviewed in [2,32]). 
In recent years, these features have been exploited to annotate cis-regulatory repertoires through genome-wide mapping of DNase hypersensitive sites, histone modifications and general coactivator occupancy, revealing the staggering preponderance and developmental dynamics of regulatory elements in the human genome [33–36].
2. Epigenomic landscapes of human neural crest
Human embryonic stem cells (hESC) and their in vitro differentiation models combined with epigenomic mapping offer an opportunity to uncover regulatory elements used in transient cell states that arise during early human development which have previously been largely inaccessible for study. One transient cell type of particular developmental and medical relevance is the neural crest (NC), a vertebrate-specific cell population specified early in embryogenesis at the neural plate border territory separating neuroectoderm from the epidermis [37,38]. After specification, NC progenitors undergo an epithelial-to-mesenchymal transition, delaminate from the dorsal neural tube, migrate throughout the body and acquire an extraordinarily broad differentiation potential, giving rise to elements of the craniofacial skeleton, the middle ear, the peripheral nervous system, pigment cells, and certain cardiac structures [37,38]. Cranial neural crest cells originating from the cephalic region of the neural tube produce a large variety of mesenchymal cell types including bone and cartilage, which elsewhere in the body are formed solely by mesodermal derivatives [39,40]. Thus, neural crest cells (NCC) are truly unique, as they not only migrate over unparalleled distances, but also effectively broaden their developmental potential upon specification, thus traversing traditionally delineated germ layer fate restrictions. In addition, aberrant NC development is associated with a broad variety of congenital malformations often including deafness and complex craniofacial defects, seen in a large number of congenital disorders known as neurocristopathies as well as in more common non-syndromic manifestations such as cleft lip and palate [41,42].
To study gene regulatory mechanisms underlying development of this unique cell type, our laboratory has developed a hESC-based in vitro model that recapitulates specification, migration and maturation of the cranial neural crest. Recently, through epigenomic profiling we demonstrated that differentiation of hESC to human neural crest cells (hNCC) is accompanied by dramatic changes in enhancer chromatin landscapes (figure 1). Over 4000 promoter-distal genomic elements were marked by an active chromatin enhancer signature (defined by the presence of coactivator p300 flanked by nucleosomes modified by H3K27ac and H3K4me1) in the hNCC, and these putative enhancers showed strong association with genomic regions implicated in craniofacial development and disease. Moreover, analysis of sequence motifs enriched at the annotated hNCC enhancers predicted major TFs that bind at these elements. We subsequently validated novel predictions coming out of our epigenomic profiling through in vitro and in vivo follow-up studies. For example, we demonstrated that nuclear receptors NR2F1 and NR2F2 (aka COUP-TF1 and COUP-TF2) are novel regulators of NC gene expression and craniofacial morphogenesis, which bind at a subset of NC enhancers along with a NC master regulator TFAP2A. Importantly, genomic regions with simultaneous co-occupancy of TFAP2A and NR2F1/2 are associated with permissive chromatin states, characterized by high levels of p300 and H3K27ac. These results suggest that, as has been reported in other systems [12,13], cooperative function of lineage specifiers (e.g. TFAP2A) and signalling effectors (e.g. NR2F1/F2) converges at active enhancer elements (figure 1). Here, we present new data that more directly illustrate cooperation between TFAP2A and NR2F1/F2 in NCCs, and discuss implications of these observations for studies of craniofacial variation.
3. Human genetic variation as a tool to investigate molecular mechanisms of enhancer function
As discussed above, cooperativity between TFs in DNA binding, coactivator recruitment and establishment of permissive chromatin states are important features of enhancer-mediated gene regulation. Unfortunately, analyses of such synergies are often confounded by the fact that major developmental TFs commonly regulate each other's expression and are essential for maintenance of the cell fate of interest, making loss-of-function studies difficult to interpret. Synergistic function of TFAP2A and NR2F1/F2 in NCCs, for example, is strongly suggested by their physical co-association and the observation that genomic regions co-bound by TFAP2A and NR2F1/2 are characterized by elevated levels of histone marks associated with active enhancers compared with regions bound by either factor alone. Though compelling, these observations are nonetheless only correlative. To complicate the matter, a more direct examination of potential cooperativity between TFAP2A and NR2F1/F2 is precluded by the observation that these factors control each other's expression, and depletion of either TF has a profound effect on the NC gene expression programme, demonstrating the difficulty in dissecting direct versus indirect effects in knockdown studies.
These caveats motivated us to explore alternative methods for studying TF cooperativity in a developmentally dynamic system. To this end we decided to take advantage of the natural genetic variation that occurs in the human genome (figure 2). We hypothesized that if two TFs cooperate in binding to DNA and the establishment of permissive chromatin states, then a single nucleotide polymorphism (SNP) affecting DNA-binding affinity of one factor should in turn affect occupancy of the other factor, as well as chromatin state at the co-bound enhancer (figure 2). To test this hypothesis we analysed NCC enhancers bound by NR2F1/F2, as reported in Rada-Iglesias et al. We then identified sequence variants falling within NR2F1/F2 recognition motifs at these enhancers, focusing on SNPs characterized by high heterozygosity in the human population. We selected 19 such common SNPs and genotyped them in a human H9 hESC line, revealing nine SNPs heterozygous in this genetic background. Using these heterozygous variants, we could directly compare biases in occupancy of TFs and histone modifications between two enhancer alleles within the same cell population (figure 2).
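The variant selection logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in our analysis: the position matrix is a toy built from the AGGTCA nuclear receptor half-site consensus rather than a curated NR2F1/F2 matrix, the match probability of 0.85 is arbitrary, heterozygosity is approximated under Hardy-Weinberg equilibrium, and all function names are hypothetical.

```python
import math

BASES = "ACGT"


def consensus_pwm(consensus, p_match=0.85):
    """Toy position probability matrix (assumption): the consensus base gets
    p_match at each position, the other three bases share the remainder."""
    p_other = (1.0 - p_match) / 3.0
    return [{b: (p_match if b == c else p_other) for b in BASES}
            for c in consensus]


def log_odds_score(seq, pwm, background=0.25):
    """Sum of log2(p / background) over positions, for a uniform background."""
    if len(seq) != len(pwm):
        raise ValueError("sequence and matrix must have the same length")
    return sum(math.log2(col[base] / background)
               for base, col in zip(seq, pwm))


def expected_heterozygosity(allele_freq):
    """Expected fraction of heterozygotes, 2p(1 - p), for a biallelic SNP."""
    return 2.0 * allele_freq * (1.0 - allele_freq)


# Illustrative half-site and a single-base variant at its third position.
pwm = consensus_pwm("AGGTCA")
ref_allele = "AGGTCA"
alt_allele = "AGATCA"

# A positive difference means the reference allele matches the motif better.
delta = log_odds_score(ref_allele, pwm) - log_odds_score(alt_allele, pwm)
print(f"motif score difference (ref - alt): {delta:.2f} bits")
print(f"expected heterozygosity at frequency 0.4: "
      f"{expected_heterozygosity(0.4):.2f}")
```

In such a scheme, SNPs with a large score difference between alleles and an expected heterozygosity close to the maximum of 0.5 would be the most attractive candidates for genotyping in a given cell line.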
4. Single nucleotide polymorphisms can affect cooperative binding of transcription factors and chromatin states at enhancers
We differentiated H9 hESC to hNCC and performed chromatin immunoprecipitation (ChIP) analyses with NR2F1, NR2F2, TFAP2A, H3K27ac and H3K4me1 antibodies, followed by quantitative genotyping of enhancers harbouring the aforementioned heterozygous SNPs in NR2F1/F2-binding motifs. In addition, we analysed nucleosomal depletion at these regions using a FAIRE assay. For three of the nine SNPs we detected modest, but significant and reproducible, allelic differences in binding of NR2F1 and NR2F2 (see figure 3; electronic supplementary material, figure S1). In each case stronger binding was associated with the allelic variant that more closely matched the recognition consensus of these nuclear receptors (note that NR2F1 and NR2F2 heterodimerize and share the same DNA-binding motif [45,47]). Importantly, quantitative genotyping of TFAP2A ChIP DNA demonstrated that binding of TFAP2A is consistently increased at the same allele that shows preferential NR2F1/F2 enrichment, even though in each case both alleles harbour equivalent TFAP2A-binding motifs, located within 100 bp of the investigated SNPs. These data suggest that the analysed genetic variants indirectly affect TFAP2A occupancy via binding cooperativity with NR2F1/F2. Although we cannot formally exclude the possibility that the elevated TFAP2A signal in ChIP assays results from increased interaction with NR2F1/F2 (leading in turn to enhanced formaldehyde cross-linkability) rather than from a direct increase in TFAP2A DNA binding, the latter interpretation is supported by the observation that the enhancer alleles displaying preferential TF binding were characterized by stronger nucleosomal depletion in the FAIRE assay (figure 3). Moreover, enrichment of H3K27ac, a chromatin mark closely correlated with enhancer activity [34,48], was also commonly increased at the enhancer alleles showing preferential TF binding, although the differences were not always statistically significant.
In contrast, we did not detect allelic biases in H3K4me1 at any of the analysed enhancers. This result is consistent with our earlier observations that H3K4me1 enrichment often precedes enhancer activation and is not dependent on the presence of TFAP2A and/or NR2F1/F2.
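One way to express an allelic bias of the kind described above is as the allele ratio in ChIP DNA normalised to the allele ratio in input DNA, which should be close to 1:1 at a heterozygous site. The following Python sketch assumes that quantitative genotyping yields the fraction of one allele per replicate; the replicate values are invented for illustration, the function names are hypothetical, and the one-sample t statistic is a generic choice rather than the test used for figure 3.

```python
import math
import statistics


def log2_allelic_bias(chip_fraction_a, input_fraction_a):
    """log2 of the A:B allele ratio in ChIP DNA divided by the A:B ratio
    in input DNA. Zero means no allelic bias after normalisation."""
    chip_ratio = chip_fraction_a / (1.0 - chip_fraction_a)
    input_ratio = input_fraction_a / (1.0 - input_fraction_a)
    return math.log2(chip_ratio / input_ratio)


def one_sample_t(values, mu=0.0):
    """t statistic for the mean of values against mu (needs n >= 2)."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu) / standard_error


# Invented fractions of allele A in three replicates (not measured data).
chip_fractions = [0.60, 0.62, 0.58]
input_fractions = [0.50, 0.51, 0.49]

biases = [log2_allelic_bias(c, i)
          for c, i in zip(chip_fractions, input_fractions)]
t_stat = one_sample_t(biases)

print("log2 allelic bias per replicate:", [round(b, 2) for b in biases])
print("t statistic against no bias:", round(t_stat, 2))
```

The t statistic would then be compared with a t distribution with n - 1 degrees of freedom. Applying the same calculation to TFAP2A, H3K27ac, H3K4me1 and FAIRE signals at the same SNP indicates whether the biases point towards the same allele.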
Taken together, our results further support a model in which cooperative function of TFAP2A and NR2F1/F2 (with likely input from other TFs) promotes establishment or maintenance of permissive chromatin states at NC enhancers. Analogous experimental strategies can be applied to investigate the functional relationships between major transcriptional regulators in other cell types and developmental processes, particularly in cases when loss-of-function studies are confounded by the caveats discussed above. The relationships between human genetic variation, TF binding, allele-specific chromatin states and gene expression are becoming an area of intense scientific interest, and several forays to characterize such relationships in a genome-wide manner have already been made [49–54]. For example, genome-wide occupancy measurements of 24 different TFs in a human lymphoblastoid cell line found that as much as 5 per cent of TF-binding sites show an allelic imbalance in occupancy. Importantly, the factors binding within these regions often exhibited a coordinated reaction to functional variants, and only approximately 12 per cent of allele-specific enrichments could be explained by sequence variants falling directly within the binding motif for a given factor, suggesting cooperative effects at these regions. Furthermore, systematic computational approaches such as those used and experimentally validated for NF-κB have taken advantage of this functional genomic covariance to accurately predict co-associated TFs. Moreover, when coupled to high-throughput sequencing methods, analysis of allelic imbalances in chromatin signatures may provide a powerful, unbiased way to identify ‘strong’ and ‘weak’ functional enhancer variants, particularly in a heterozygous setting where even subtle allelic biases can be reliably quantified.
Several recent studies support the role of non-coding genetic diversity as a major driver of individual- and allele-specific chromatin states and uncover association of such variants with modulation in gene expression [51–54].
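When allelic imbalance is read out by sequencing rather than by quantitative genotyping, a common starting point is an exact binomial test of the reads covering a heterozygous SNP against an expected 1:1 ratio. The Python sketch below illustrates that generic test; it is not the method of any study cited here, the read counts are invented, and the function name is hypothetical. Real analyses would additionally need to correct for reference mapping bias and for multiple testing across sites.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than observing k successes in n trials."""
    pmf = [comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so that ties survive floating-point rounding.
    tail = sum(prob for prob in pmf if prob <= observed * (1.0 + 1e-9))
    return min(1.0, tail)


# Invented read counts at one heterozygous SNP (not measured data).
reads_allele_a = 70
reads_allele_b = 30
total_reads = reads_allele_a + reads_allele_b

p_value = binomial_two_sided_p(reads_allele_a, total_reads)
print(f"allele A fraction: {reads_allele_a / total_reads:.2f}")
print(f"two-sided binomial p-value: {p_value:.2e}")
```

Because the test is evaluated within one heterozygous sample, both alleles share the same trans-acting environment, which is what makes subtle cis-acting differences detectable at sufficient read depth.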
5. Molecular and phenotypic impact of genetic variation at enhancers
As discussed above, a single SNP can affect not only the binding of the TF whose motif is altered, but also binding of additional TFs that are recruited in cis to the same enhancer. Given the cooperative principles governing enhancer activity [2,5,9], a single SNP, therefore, has the potential to alter the overall enhancer state, as reflected by changes in TF binding, nucleosomal occupancy and histone modifications typical of active enhancers. We will hereafter refer to genetic variants that measurably affect enhancer states as enhancer-SNPs (eSNPs). eSNPs represent only a small subset of genetic variation within enhancers, because these polymorphisms not only have to affect DNA recognition of a TF critical for enhancer activity, but this occupancy change must not be compensated for by other factors or additional binding sites for the same TF. Even among eSNPs, those that dramatically alter enhancer activity are most likely rare. Instead, more subtle allelic biases in enhancer states are probably much more common. An outstanding question remains as to what extent such moderate changes in enhancer states elicit changes in level or timing of gene expression. Again, the answer will vary depending on the specific enhancer–promoter pair, on the degree of redundancy with other enhancers acting on the same target and on the environmental conditions, which have been shown to reveal essentiality for seemingly redundant enhancers [3,55,56]. Nonetheless, evidence is emerging that cis-regulatory variation does result in transcriptional changes and that even subtle differences in gene expression level and spatio-temporal control can have important consequences for phenotypes [57–64]. In one example of particular relevance, a single SNP within a craniofacial-associated IRF6 enhancer element disrupting TFAP2A binding has already been shown to confer elevated risk for non-syndromic cleft lip.
Comparison of the developmental basis of body-pattern evolution in animal models shows that morphological variation is largely a product of quantitative and spatio-temporal changes in deployment of conserved regulatory networks [66,67]. Some variation in gene expression may arise from somatic mosaicism, environmental perturbation, epigenetic influences or simple biological stochasticity, but a significant proportion of phenotypic diversity is encoded within heritable genetic information. Direct genome sequencing, linkage analysis and genome-wide association studies reveal that much of this transmissible information responsible for trait modulation is buried within non-coding regions of the genome. This can be rationalized as a consequence of genetic pleiotropy, because mutations disrupting function within a coding region of an important developmental gene may confer widespread detrimental effects throughout a developing organism. As mentioned above, regulatory elements such as enhancers are thought instead to act in a highly tissue- and stage-specific manner, commonly in the context of other redundant or partially redundant regulatory elements. Variation arising within these non-coding regulatory regions is, therefore, less likely to be deleterious to the organism as a whole, rendering enhancers more accommodating of functional polymorphism on a population-wide level because they permit small changes in transcriptional regulation to arise without a massive reshuffling of developmental patterning networks.
In addition to their impact on intraspecies variation, eSNPs can also play an important role in speciation and evolutionary adaptation to changing environments. Since the discovery that coding regions of the genome remain largely conserved across species, it has long been postulated that evolutionary divergence mostly arises from quantitative (and spatio-temporal) rather than qualitative changes to gene function. Recent advances in sequencing technologies have substantiated this prediction that changes within regulatory elements are a major source of evolutionary divergence, even between closely related species [69–71]. Importantly, there is now theory and evidence to argue that the same expression perturbations driving intraspecies variation may ultimately be responsible for speciation and fixed interspecies divergence.
6. Genetic variation at neural crest regulatory elements: implications for human craniofacial diversity
One of the most interesting examples of intraspecies variation is the human face, a single feature that best distinguishes an individual while also connecting each of us to our broader ethnic and familial ancestry. Although craniofacial morphology is known to be highly heritable, genetic factors that underlie normal variation in human face and skull shape remain poorly understood . Craniofacial structures originate largely from the cranial neural crest, a highly developmentally plastic cell population, which arises in the anterior part of the nascent neural tube and forms the majority of bone, cartilage and connective tissue of the head and face. Interestingly, in contrast to endochondral bone formation in which bone develops from a cartilage template, recent research suggests that craniofacial bone and cartilage (which undergo largely intramembranous ossification) may represent independent tissue modules and, although they all derive from the NC, are controlled by different genes and form separate condensations [71,74–78]. This modularity, combined with developmental robustness imparted by the crest's plasticity, means that specific traits can be adjusted in a fairly autonomous manner while maintaining integration with surrounding structures. A quick glance at the spectrum of shapes and sizes of facial features throughout the human population demonstrates this potential for quantitative phenotype modulation and reflects a multifactorial genetic contribution. Facial morphology, therefore, provides a ripe and tractable model for investigating the link between genotype, molecular phenotype and trait modulation in a complex human developmental context.
We anticipate that enhancers acting within the NCCs and their derivatives during gestational development are major drivers of facial phenotypic diversity. In support of early developmental and NC-driven origins of facial variation, avian xenotransplantation studies show that the NC contains autonomous morphogenetic information that can coordinate with surrounding tissues to drive species-specific (and probably individual-specific) facial morphology. Indeed, manipulation of conserved signalling pathway effectors such as BMP4 in specific regions of avian crest-originated facial prominences is able to stimulate local proliferation domains that parallel evolutionary differences between species of Darwin's finches and chick. This suggests that heterochronic and heterotopic changes within the early crest itself are sufficient to influence facial traits and underlines the necessity for establishing models of human neural crest development to better understand the regulatory control of facial morphogenesis.
Our studies of the human neural crest epigenome represent initial attempts to characterize a cis-regulatory repertoire relevant for early steps in the formation of the human face. Although we identified thousands of putative enhancers, we probably did not capture the full complexity of craniofacial cis-regulation, and therefore later stages of craniofacial development must also be considered in subsequent functional genomic analyses. In the experiments described here, we took advantage of natural genetic variation to provide further support for cooperative function of TFAP2A and NR2F1/F2 at NC enhancers and supplied a proof-of-principle demonstrating that eSNPs exist within cranial neural crest enhancers. The future challenge will be not only to systematically identify such eSNPs in the human population, but also to link them with diversity in specific aspects of human craniofacial morphology.
This work was supported by NIH RO1 GM095555 and CIRM RB3-05100 grants for J.W. and Siebel Scholarship for A.R-I.
One contribution of 12 to a Discussion Meeting Issue ‘Regulation from a distance: long-range regulation of gene expression’.
© 2013 The Author(s). Published by the Royal Society. All rights reserved.