-
[FORUM] Don't throw the baby out with the bathwater: Enabling a bottom-up approach in genome-wide association studies
-
[REVIEW] Defensins and the dynamic genome: What we can learn from structural variation at human chromosome band 8p23.1
Over the past four years, genome-wide studies have uncovered numerous examples of structural variation in the human genome. This includes structural variation that changes copy number, such as deletion and duplication, and structural variation that does not change copy number, such as orientation and positional polymorphism. One region that contains all these types of variation spans the chromosome band 8p23.1. This region has been studied in some depth, and the focus of this review is to examine our current understanding of the variation of this region. We also consider whether this region is a good model for other structurally variable regions in the genome and what the implications of this variation are for clinical studies. Finally, we discuss the bioinformatics challenges raised and the evolution of the region, and suggest some future priorities for structural variation research.
-
[ARTICLES] Copy number variation and evolution in humans and chimpanzees
Copy number variants (CNVs) underlie many aspects of human phenotypic diversity and provide the raw material for gene duplication and gene family expansion. However, our understanding of their evolutionary significance remains limited. We performed comparative genomic hybridization on a single human microarray platform to identify CNVs among the genomes of 30 humans and 30 chimpanzees as well as fixed copy number differences between species. We found that human and chimpanzee CNVs occur in orthologous genomic regions far more often than expected by chance and are strongly associated with the presence of highly homologous intrachromosomal segmental duplications. By adapting population genetic analyses for use with copy number data, we identified functional categories of genes that have likely evolved under purifying or positive selection for copy number changes. In particular, duplications and deletions of genes with inflammatory response and cell proliferation functions may have been fixed by positive selection and involved in the adaptive phenotypic differentiation of humans and chimpanzees.
-
[LETTERS] Reduced purifying selection prevails over positive selection in human copy number variant evolution
Copy number variation is a dominant contributor to genomic variation and may frequently underlie an individual’s variable susceptibilities to disease. Here we question our previous proposition that copy number variants (CNVs) are often retained in the human population because of their adaptive benefit. We show that genic biases of CNVs are best explained, not by positive selection, but by reduced efficiency of selection in eliminating deleterious changes from the human population. Of four CNV data sets examined, three exhibit significant increases in protein evolutionary rates. These increases appear to be attributable to the frequent coincidence of CNVs with segmental duplications (SDs) that recombine infrequently. Furthermore, human orthologs of mouse genes, which, when disrupted, result in pre- or postnatal lethality, are unusually depleted in CNVs. Together, these findings support a model of reduced purifying selection (Hill–Robertson interference) within copy number variable regions that are enriched in nonessential genes, allowing both the fixation of slightly deleterious substitutions and increased drift of CNV alleles. Additionally, all four CNV sets exhibited increased rates of interspecies chromosomal rearrangement and nucleotide substitution and an increased gene density. We observe that sequences with high G+C contents are most prone to copy number variation. In particular, frequently duplicated human SD sequence, or CNVs that are large and/or observed frequently, tend to be elevated in G+C content. In contrast, SD sequences that appear fixed in the human population lie more frequently within low G+C sequence. These findings provide an overarching view of how CNVs arise and segregate in the human population.
-
[LETTERS] Copy number variation at the breakpoint region of isochromosome 17q
Isochromosome 17q, or i(17q), is one of the most frequent nonrandom changes occurring in human neoplasia. Most of the i(17q) breakpoints cluster within a ~240-kb interval located in the Smith-Magenis syndrome common deletion region in 17p11.2. The breakpoint cluster region is characterized by a complex architecture with large (~38–49 kb), inverted and directly oriented, low-copy repeats (LCRs), known as REPA and REPB, that apparently lead to genomic instability and facilitate somatic genetic rearrangements. Through the analysis of bacterial artificial chromosome (BAC) clones, pulsed-field gel electrophoresis (PFGE), and public array comparative genomic hybridization (array CGH) data, we show that the REPA/B structure is also susceptible to frequent meiotic rearrangements. It is a highly dynamic genomic region undergoing deletions, inversions, and duplications likely produced by non-allelic homologous recombination (NAHR) mediated by the highly identical SNORD3@, also known as U3, gene cluster present therein. We detected at least seven different REPA/B structures in samples from 29 individuals, of which six represented potentially novel structures. Two polymorphic copy number variants (CNVs), detected in 20% of samples, could be structurally described along with the likely underlying molecular mechanism for formation. Our data show the high susceptibility to rearrangements at the i(17q) breakpoint cluster region in the general population and exemplify how large genomic regions laden with LCRs still represent a technical challenge for both determining specific structure and assaying population variation. The variant REPA/B structures identified may have different susceptibilities for inducing i(17q), thus potentially representing important risk alleles for tumor progression.
-
[LETTERS] Unexpected complexity at breakpoint junctions in phenotypically normal individuals and mechanisms involved in generating balanced translocations t(1;22)(p36;q13)
Approximately one in 500 individuals carries a reciprocal translocation. Balanced translocations are usually associated with a normal phenotype unless the translocation breakpoints disrupt a gene(s) or cause a position effect. We investigated breakpoint junctions at the sequence level in phenotypically normal balanced translocation carriers. Eight breakpoint junctions derived from four nonrelated subjects with apparently balanced translocation t(1;22)(p36;q13) were examined. Additions of nucleotides, deletions, duplications, and a triplication identified at the breakpoints demonstrate high complexity at the breakpoint junctions and indicate involvement of multiple mechanisms in the DNA breakage and repair process during translocation formation. Possible detailed nonhomologous end-joining scenarios for t(1;22) cases are presented. We propose that cryptic imbalances in phenotypically normal, balanced translocation carriers may be more common than currently appreciated.
-
[LETTERS] Dispensability of mammalian DNA
In the lab, the cis-regulatory network seems to exhibit great functional redundancy. Many experiments testing enhancer activity of neighboring cis-regulatory elements show largely overlapping expression domains. Of recent interest, mice in which cis-regulatory ultraconserved elements were knocked out showed no obvious phenotype, further suggesting functional redundancy. Here, we present a global evolutionary analysis of mammalian conserved nonexonic elements (CNEs), and find strong evidence to the contrary. Given a set of CNEs conserved between several mammals, we characterize functional dispensability as the propensity for the ancestral element to be lost in mammalian species internal to the spanned species tree. We show that ultraconserved-like elements are over 300-fold less likely than neutral DNA to have been lost during rodent evolution. In fact, many thousands of noncoding loci under purifying selection display near uniform indispensability during mammalian evolution, largely irrespective of nucleotide conservation level. These findings suggest that many genomic noncoding elements possess functions that contribute noticeably to organism fitness in naturally evolving populations.
-
[LETTERS] Evolution of the mammalian transcription factor binding repertoire via transposable elements
Identification of lineage-specific innovations in genomic control elements is critical for understanding transcriptional regulatory networks and phenotypic heterogeneity. We analyzed, from an evolutionary perspective, the binding regions of seven mammalian transcription factors (ESR1, TP53, MYC, RELA, POU5F1, SOX2, and CTCF) identified on a genome-wide scale by different chromatin immunoprecipitation approaches and found that only a minority of sites appear to be conserved at the sequence level. Instead, we uncovered a pervasive association with genomic repeats by showing that a large fraction of the bona fide binding sites for five of the seven transcription factors (ESR1, TP53, POU5F1, SOX2, and CTCF) are embedded in distinctive families of transposable elements. Using the age of the repeats, we established that these repeat-associated binding sites (RABS) have been associated with significant regulatory expansions throughout the mammalian phylogeny. We validated the functional significance of these RABS by showing that they are over-represented in proximity of regulated genes and that the binding motifs within these repeats have undergone evolutionary selection. Our results demonstrate that transcriptional regulatory networks are highly dynamic in eukaryotic genomes and that transposable elements play an important role in expanding the repertoire of binding sites.
-
[LETTERS] E2F in vivo binding specificity: Comparison of consensus versus nonconsensus binding sites
We have previously shown that most sites bound by E2F family members in vivo do not contain E2F consensus motifs. However, differences between in vivo target sites that contain or lack a consensus E2F motif have not been explored. To understand how E2F binding specificity is achieved in vivo, we have addressed how E2F family members are recruited to core promoter regions that lack a consensus motif and are excluded from other regions that contain a consensus motif. Using chromatin immunoprecipitation coupled with DNA microarray analysis (ChIP-chip) assays, we have shown that the predominant factors specifying whether E2F is recruited to an in vivo binding site are (1) the site must be in a core promoter and (2) the region must be utilized as a promoter in that cell type. We have tested three models for recruitment of E2F to core promoters lacking a consensus site, including (1) indirect recruitment, (2) looping to the core promoter mediated by an E2F bound to a distal motif, and (3) assisted binding of E2F to a site that weakly resembles an E2F motif. To test these models, we developed a new in vivo assay, termed eChIP, which allows analysis of transcription factor binding to isolated fragments. Our findings suggest that in vivo (1) a consensus motif is not sufficient to recruit E2Fs, (2) E2Fs can bind to isolated regions that lack a consensus motif, and (3) binding can require regions other than the best match to the E2F motif.
-
[LETTERS] Reconfiguration of genomic anchors upon transcriptional activation of the human major histocompatibility complex
The folding of chromatin into topologically constrained loop domains is essential for genomic function. We have identified genomic anchors that define the organization of chromatin loop domains across the human major histocompatibility complex (MHC). This locus contains critical genes for immunity and is associated with more diseases than any other region of the genome. Classical MHC genes are expressed in a cell type-specific pattern and can be induced by cytokines such as interferon-gamma (IFNG). Transcriptional activation of the MHC was associated with a reconfiguration of chromatin architecture resulting from the formation of additional genomic anchors. These findings suggest that the dynamic arrangement of genomic anchors and loops plays a role in transcriptional regulation.