Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ?5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.
Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. Although most of these elements are now fixed in the population, some MEs, including ALU, L1, SVA and HERV-K elements, are still actively duplicating. Mobile element insertions (MEIs) have been associated with human genetic disorders, including Crohn's disease, hemophilia, and various types of cancer, motivating the need for accurate MEI detection methods. To comprehensively identify and accurately characterize these variants in whole genome next-generation sequencing (NGS) data, a computationally efficient detection and genotyping method is required. Current computational tools are unable to call MEI polymorphisms with sufficiently high sensitivity and specificity, or call individual genotypes with sufficiently high accuracy.
We analyzed 83 fully sequenced great ape genomes for mobile element insertions, predicting a total of 49,452 fixed and polymorphic Alu and long interspersed element 1 (L1) insertions not present in the human reference assembly and assigning each retrotransposition event to a different time point during great ape evolution. We used these homoplasy-free markers to construct a mobile element insertions-based phylogeny of humans and great apes and demonstrate their differential power to discern ape subspecies and populations. Within this context, we find a good correlation between L1 diversity and single-nucleotide polymorphism heterozygosity (r(2) = 0.65) in contrast to Alu repeats, which show little correlation (r(2) = 0.07). We estimate that the "rate" of Alu retrotransposition has differed by a factor of 15-fold in these lineages. Humans, chimpanzees, and bonobos show the highest rates of Alu accumulation--the latter two since divergence 1.5 Mya. The L1 insertion rate, in contrast, has remained relatively constant, with rates differing by less than a factor of three. We conclude that Alu retrotransposition has been the most variable form of genetic variation during recent human-great ape evolution, with increases and decreases occurring over very short periods of evolutionary time.
The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.
The human genome contains approximately one million Alu elements which comprise more than 10% of human DNA by mass. Alu elements possess direction, and are distributed almost equally in positive and negative strand orientations throughout the genome. Previously, it has been shown that closely spaced Alu pairs in opposing orientation (inverted pairs) are found less frequently than Alu pairs having the same orientation (direct pairs). However, this imbalance has only been investigated for Alu pairs separated by 650 or fewer base pairs (bp) in a study conducted prior to the completion of the draft human genome sequence.
As a consequence of the accumulation of insertion events over evolutionary time, mobile elements now comprise nearly half of the human genome. The Alu, L1, and SVA mobile element families are still duplicating, generating variation between individual genomes. Mobile element insertions (MEI) have been identified as causes for genetic diseases, including hemophilia, neurofibromatosis, and various cancers. Here we present a comprehensive map of 7,380 MEI polymorphisms from the 1000 Genomes Project whole-genome sequencing data of 185 samples in three major populations detected with two detection methods. This catalog enables us to systematically study mutation rates, population segregation, genomic distribution, and functional properties of MEI polymorphisms and to compare MEI to SNP variation from the same individuals. Population allele frequencies of MEI and SNPs are described, broadly, by the same neutral ancestral processes despite vastly different mutation mechanisms and rates, except in coding regions where MEI are virtually absent, presumably due to strong negative selection. A direct comparison of MEI and SNP diversity levels suggests a differential mobile element insertion rate among populations.
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serves as a resource for sequencing-based association studies.
Orang-utan is derived from a Malay term meaning man of the forest and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000?years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.
The search for longevity-determining genes in human has largely neglected the operation of genetic interactions. We have identified a novel combination of common variants of three genes that has a marked association with human lifespan and healthy aging. Subjects were recruited and stratified according to their genetically inferred ethnic affiliation to account for population structure. Haplotype analysis was performed in three candidate genes, and the haplotype combinations were tested for association with exceptional longevity. An HRAS1 haplotype enhanced the effect of an APOE haplotype on exceptional survival, and a LASS1 haplotype further augmented its magnitude. These results were replicated in a second population. A profile of healthy aging was developed using a deficit accumulation index, which showed that this combination of gene variants is associated with healthy aging. The variation in LASS1 is functional, causing enhanced expression of the gene, and it contributes to healthy aging and greater survival in the tenth decade of life. Thus, rare gene variants need not be invoked to explain complex traits such as aging; instead rare congruence of common gene variants readily fulfills this role. The interaction between the three genes described here suggests new models for cellular and molecular mechanisms underlying exceptional survival and healthy aging that involve lipotoxicity.
Mobile elements represent a unique and powerful set of tools for understanding the variation in a genome. Methods exist not only to utilize the polymorphisms among and within taxa to various ends but also to investigate the mechanism through which mobilization occurs. The number of methods to accomplish these ends is ever growing. Here, we present several protocols designed to assay mobile element-based variation within and among individual genomes.
The zebra finch is an important model organism in several fields with unique relevance to human neuroscience. Like other songbirds, the zebra finch communicates through learned vocalizations, an ability otherwise documented only in humans and a few other animals and lacking in the chicken-the only bird with a sequenced genome until now. Here we present a structural, functional and comparative analysis of the genome sequence of the zebra finch (Taeniopygia guttata), which is a songbird belonging to the large avian order Passeriformes. We find that the overall structures of the genomes are similar in zebra finch and chicken, but they differ in many intrachromosomal rearrangements, lineage-specific gene family expansions, the number of long-terminal-repeat-based retrotransposons, and mechanisms of sex chromosome dosage compensation. We show that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets. We also show evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience. These results indicate an active involvement of the genome in neural processes underlying vocal communication and identify potential genetic substrates for the evolution and regulation of this behaviour.
Gibbons (Hylobatidae) shared a common ancestor with the other hominoids only 15-18 million years ago. Nevertheless, gibbons show very distinctive features that include heavily rearranged chromosomes. Previous observations indicate that this phenomenon may be linked to the attenuated epigenetic repression of transposable elements (TEs) in gibbon species. Here we describe the massive expansion of a repeat in almost all the centromeres of the eastern hoolock gibbon (Hoolock leuconedys). We discovered that this repeat is a new composite TE originating from the combination of portions of three other elements (L1ME5, AluSz6, and SVA_A) and thus named it LAVA. We determined that this repeat is found in all the gibbons but does not occur in other hominoids. Detailed investigation of 46 different LAVA elements revealed that the majority of them have target site duplications (TSDs) and a poly-A tail, suggesting that they have been retrotransposing in the gibbon genome. Although we did not find a direct correlation between the emergence of LAVA elements and human-gibbon synteny breakpoints, this new composite transposable element is another mark of the great plasticity of the gibbon genome. Moreover, the centromeric expansion of LAVA insertions in the hoolock closely resembles the massive centromeric expansion of the KERV-1 retroelement reported for wallaby (marsupial) interspecific hybrids. The similarity between the two phenomena is consistent with the hypothesis that evolution of the gibbons is characterized by defects in epigenetic repression of TEs, perhaps triggered by interspecific hybridization.
Sequence analysis of the orangutan genome revealed that recent proliferative activity of Alu elements has been uncharacteristically quiescent in the Pongo (orangutan) lineage, compared with all previously studied primate genomes. With relatively few young polymorphic insertions, the genomic landscape of the orangutan seemed like the ideal place to search for a driver, or source element, of Alu retrotransposition.
Related JoVE Video
Journal of Visualized Experiments
What is Visualize?
JoVE Visualize is a tool created to match the last 5 years of PubMed publications to methods in JoVE's video library.
How does it work?
We use abstracts found on PubMed and match them to JoVE videos to create a list of 10 to 30 related methods videos.
Video X seems to be unrelated to Abstract Y...
In developing our video relationships, we compare around 5 million PubMed articles to our library of over 4,500 methods videos. In some cases the language used in the PubMed abstracts makes matching that content to a JoVE video difficult. In other cases, there happens not to be any content in our video library that is relevant to the topic of a given abstract. In these cases, our algorithms are trying their best to display videos with relevant content, which can sometimes result in matched videos with only a slight relation.