Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ~5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.
Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. Although most of these elements are now fixed in the population, some MEs, including ALU, L1, SVA and HERV-K elements, are still actively duplicating. Mobile element insertions (MEIs) have been associated with human genetic disorders, including Crohn's disease, hemophilia, and various types of cancer, motivating the need for accurate MEI detection methods. To comprehensively identify and accurately characterize these variants in whole genome next-generation sequencing (NGS) data, a computationally efficient detection and genotyping method is required. Current computational tools are unable to call MEI polymorphisms with sufficiently high sensitivity and specificity, or call individual genotypes with sufficiently high accuracy.
Research into great ape genomes has revealed widely divergent activity levels over time for Alu elements. However, the diversity of this mobile element family in the genome of the western lowland gorilla has previously been uncharacterized. Alu elements are primate-specific short interspersed elements that have been used as phylogenetic and population genetic markers for more than two decades. Alu elements are present at high copy number in the genomes of all primates surveyed thus far. The AluY subfamily and its derivatives have been recognized as the evolutionarily youngest Alu subfamily in the Old World primate lineage.
Objectives: The purpose of this study was to assess the relationship between apolipoprotein E (APOE), life events and engagement, and subjective well-being (as measured by positive and negative affect) among centenarians. Based on the life stress paradigm, we predicted that higher levels of stress would allow APOE to influence positive and negative affect. Method: 196 centenarians and near-centenarians (98 years and older) of the Georgia Centenarian Study participated in this research. The APOE, positive and negative affect, the number of recent (last 2 years) and lifelong (more than 20 years prior to testing) events, as well as a number of life engagement tasks were assessed. Results: Results suggested that centenarians carrying the APOE ε4 allele rated lower in positive affect, the number of lifelong events, and in engaged lifestyle, when compared to centenarians without the APOE ε4 allele (t = 3.43, p < .01; t = 3.19, p < .01; and t = 2.33, p < .05, respectively). Blockwise multiple regressions indicated that the APOE ε4 predicted positive but not negative affect after controlling for demographics. Gene-environment interactions were obtained for the APOE ε4 and lifelong events, suggesting that carriers of the APOE ε4 allele had higher scores of negative affect after having experienced more events, whereas noncarriers had reduced negative affect levels after having experienced more events. Conclusion: APOE ε4 is directly related to positive affect and is related to negative affect in interaction with life events.
We analyzed 83 fully sequenced great ape genomes for mobile element insertions, predicting a total of 49,452 fixed and polymorphic Alu and long interspersed element 1 (L1) insertions not present in the human reference assembly and assigning each retrotransposition event to a different time point during great ape evolution. We used these homoplasy-free markers to construct a mobile element insertions-based phylogeny of humans and great apes and demonstrate their differential power to discern ape subspecies and populations. Within this context, we find a good correlation between L1 diversity and single-nucleotide polymorphism heterozygosity (r² = 0.65) in contrast to Alu repeats, which show little correlation (r² = 0.07). We estimate that the "rate" of Alu retrotransposition has differed by a factor of 15-fold in these lineages. Humans, chimpanzees, and bonobos show the highest rates of Alu accumulation--the latter two since divergence 1.5 Mya. The L1 insertion rate, in contrast, has remained relatively constant, with rates differing by less than a factor of three. We conclude that Alu retrotransposition has been the most variable form of genetic variation during recent human-great ape evolution, with increases and decreases occurring over very short periods of evolutionary time.
Alu retrotransposons are the most numerous and active mobile elements in humans, causing genetic disease and creating genomic diversity. Mobile element scanning (ME-Scan) enables comprehensive and affordable identification of mobile element insertions (MEI) using targeted high-throughput sequencing of multiplexed MEI junction libraries. In a single experiment, ME-Scan identifies nearly all AluYb8 and AluYb9 elements, with high sensitivity for both rare and common insertions, in 169 individuals of diverse ancestry. ME-Scan detects heterozygous insertions in single individuals with 91% sensitivity. Insertion presence or absence states determined by ME-Scan are 95% concordant with those determined by locus-specific PCR assays. By sampling diverse populations from Africa, South Asia, and Europe, we are able to identify 5799 Alu insertions, including 2524 novel ones, some of which occur in exons. Sub-Saharan populations and a Pygmy group in particular carry numerous intermediate-frequency Alu insertions that are absent in non-African groups. There is a significant dearth of exon-interrupting insertions among common Alu polymorphisms, but the density of singleton Alu insertions is constant across exonic and nonexonic regions. In one case, a validated novel singleton Alu interrupts a protein-coding exon of FAM187B. This implies that exonic Alu insertions are generally deleterious and thus eliminated by natural selection, but not so quickly that they cannot be observed as extremely rare variants.
The human retrotransposon with the highest copy number is the Alu element. The human genome contains over one million Alu elements that collectively account for over ten percent of our DNA. Full-length Alu elements are randomly distributed throughout the genome in both forward and reverse orientations. However, full-length widely spaced Alu pairs having two Alus in the same (direct) orientation are statistically more prevalent than Alu pairs having two Alus in the opposite (inverted) orientation. The cause of this phenomenon is unknown. It has been hypothesized that this imbalance is the consequence of anomalous inverted Alu pair interactions. One proposed mechanism suggests that inverted Alu pairs can ectopically interact, exposing both ends of each Alu element making up the pair to a potential double-strand break, or "hit". This hypothesized "two-hit" (two double-strand breaks) potential per Alu element was used to develop a model for comparing the relative instabilities of human genes. The model incorporates both 1) the two-hit double-strand break potential of Alu elements and 2) the probability of exon-damaging deletions extending from these double-strand breaks. This model was used to compare the relative instabilities of 50 deletion-prone cancer genes and 50 randomly selected genes from the human genome. The output of the Alu element-based genomic instability model developed here is shown to coincide with the observed instability of deletion-prone cancer genes. The 50 cancer genes are collectively estimated to be 58% more unstable than the randomly chosen genes using this model. Seven of the deletion-prone cancer genes, ATM, BRCA1, FANCA, FANCD2, MSH2, NCOR1 and PBRM1, were among the most unstable 10% of the 100 genes analyzed. This algorithm may lay the foundation for comparing genetic risks posed by structural variations that are unique to specific individuals, families and people groups.
Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo "clouds"). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%-69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (~25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed "element-specific" P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ~100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed.
The human genome contains approximately one million Alu elements which comprise more than 10% of human DNA by mass. Alu elements possess direction, and are distributed almost equally in positive and negative strand orientations throughout the genome. Previously, it has been shown that closely spaced Alu pairs in opposing orientation (inverted pairs) are found less frequently than Alu pairs having the same orientation (direct pairs). However, this imbalance has only been investigated for Alu pairs separated by 650 or fewer base pairs (bp) in a study conducted prior to the completion of the draft human genome sequence.
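The direct-versus-inverted comparison described above can be illustrated with a minimal sketch. It is not the method of the study: the coordinate convention (half-open intervals), the restriction to neighbouring elements, and the names `Alu` and `count_adjacent_pairs` are all assumptions made for illustration; only the 650 bp spacing limit comes from the text.

```python
from typing import Dict, List, Tuple

# Hypothetical record: (start, end, strand) on a single chromosome,
# with half-open coordinates and strand given as "+" or "-".
Alu = Tuple[int, int, str]


def count_adjacent_pairs(alus: List[Alu], max_gap: int = 650) -> Dict[str, int]:
    """Count direct (same strand) and inverted (opposite strand) pairs
    among neighbouring Alu elements separated by at most max_gap bp."""
    counts = {"direct": 0, "inverted": 0}
    ordered = sorted(alus)  # sort by start coordinate
    for (_, end1, strand1), (start2, _, strand2) in zip(ordered, ordered[1:]):
        gap = start2 - end1
        if 0 <= gap <= max_gap:
            key = "direct" if strand1 == strand2 else "inverted"
            counts[key] += 1
    return counts


# Toy annotation, not real data.
example = [(100, 400, "+"), (600, 900, "+"), (1200, 1500, "-"), (5000, 5300, "-")]
result = count_adjacent_pairs(example)
print(result)
```

In this toy input the first two elements form a direct pair, the second and third form an inverted pair, and the last element lies too far away to be counted. Raising `max_gap` would extend the comparison to more widely spaced pairs.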
As a consequence of the accumulation of insertion events over evolutionary time, mobile elements now comprise nearly half of the human genome. The Alu, L1, and SVA mobile element families are still duplicating, generating variation between individual genomes. Mobile element insertions (MEI) have been identified as causes for genetic diseases, including hemophilia, neurofibromatosis, and various cancers. Here we present a comprehensive map of 7,380 MEI polymorphisms from the 1000 Genomes Project whole-genome sequencing data of 185 samples in three major populations detected with two detection methods. This catalog enables us to systematically study mutation rates, population segregation, genomic distribution, and functional properties of MEI polymorphisms and to compare MEI to SNP variation from the same individuals. Population allele frequencies of MEI and SNPs are described, broadly, by the same neutral ancestral processes despite vastly different mutation mechanisms and rates, except in coding regions where MEI are virtually absent, presumably due to strong negative selection. A direct comparison of MEI and SNP diversity levels suggests a differential mobile element insertion rate among populations.
Transposable elements (TEs) are a tremendous source of genome instability and genetic variation. Of particular interest to investigators of human biology and human evolution are retrotransposon insertions that are recent and/or polymorphic in the human population. As a consequence, the ability to assay large numbers of polymorphic TEs in a given genome is valuable. Five recent manuscripts each propose methods to scan whole human genomes to identify, map, and, in some cases, genotype polymorphic retrotransposon insertions in multiple human genomes simultaneously. These technologies promise to revolutionize our ability to analyze human genomes for TE-based variation important to studies of human variability and human disease. Furthermore, the approaches hold promise for researchers interested in nonhuman genomic variability. Herein, we explore the methods reported in the manuscripts and discuss their applications to aspects of human biology and the biology of other organisms.
Colobine monkeys constitute a diverse group of primates with major radiations in Africa and Asia. However, phylogenetic relationships among genera are under debate, and recent molecular studies with incomplete taxon-sampling revealed discordant gene trees. To solve the evolutionary history of colobine genera and to determine causes for possible gene tree incongruences, we combined presence/absence analysis of mobile elements with autosomal, X chromosomal, Y chromosomal and mitochondrial sequence data from all recognized colobine genera.
Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.
Orang-utan is derived from a Malay term meaning man of the forest and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.
Crocodylus is the largest genus within the order Crocodylia, consisting of eleven species. This paper reports the complete mitochondrial genome sequences of three Crocodylus species, Crocodylus moreletii, Crocodylus johnstoni and Crocodylus palustris, and compares the newly obtained mitochondrial DNA sequences with those of other crocodilians available in public databases. The mitochondrial genomes of C. moreletii, C. johnstoni and C. palustris are 16,827 bp, 16,851 bp and 16,852 bp in length, respectively. These mitochondrial genomes consist of 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and a non-coding region. The mitochondrial genomes of all the Crocodylus species studied herein show identical characteristics in terms of nucleotide composition and codon usage, suggesting analogous evolutionary patterns within the genus Crocodylus. The synonymous and non-synonymous substitution rates for all the protein-coding genes of Crocodylus ranged between 0.001 and 0.275, which indicates the prevalence of purifying selection in these genes. The phylogenetic analyses based on complete mitochondrial DNA data substantiate the previously established crocodilian phylogeny. This study provides a better understanding of the crocodilian mitochondrial genome, and the data described herein will prove useful for future studies concerning crocodilian mitochondrial genome evolution.
The search for longevity-determining genes in humans has largely neglected the operation of genetic interactions. We have identified a novel combination of common variants of three genes that has a marked association with human lifespan and healthy aging. Subjects were recruited and stratified according to their genetically inferred ethnic affiliation to account for population structure. Haplotype analysis was performed in three candidate genes, and the haplotype combinations were tested for association with exceptional longevity. An HRAS1 haplotype enhanced the effect of an APOE haplotype on exceptional survival, and a LASS1 haplotype further augmented its magnitude. These results were replicated in a second population. A profile of healthy aging was developed using a deficit accumulation index, which showed that this combination of gene variants is associated with healthy aging. The variation in LASS1 is functional, causing enhanced expression of the gene, and it contributes to healthy aging and greater survival in the tenth decade of life. Thus, rare gene variants need not be invoked to explain complex traits such as aging; instead rare congruence of common gene variants readily fulfills this role. The interaction between the three genes described here suggests new models for cellular and molecular mechanisms underlying exceptional survival and healthy aging that involve lipotoxicity.
The genus Crocodylus consists of 11 species including the largest living reptile, Crocodylus porosus. The current understanding of the intrageneric relationships between the members of the genus Crocodylus is sparse. Even though members of this genus have been included in many phylogenetic analyses, different molecular approaches have resulted in incongruent trees, leaving the phylogenetic relationships among the members of Crocodylus, including the placement of C. porosus, unresolved. In this study, the complete mitochondrial genome sequences, along with partial mitochondrial gene sequences and a nuclear gene, C-mos, were utilized to infer the intrageneric relationships among Crocodylus species, with a special emphasis on the phylogenetic position of C. porosus. Four different phylogenetic methods, Neighbour Joining, Maximum Parsimony, Maximum Likelihood and Bayesian inference, were utilized to reconstruct the crocodilian phylogeny. The uncorrected pairwise distances computed in the study show close proximity of C. porosus to C. siamensis, and the tree topologies obtained also consistently substantiated this relationship with high statistical support. In addition, the relationship between C. acutus and C. intermedius was retained in all the analyses. The results of the current phylogenetic study support the well-established intergeneric crocodilian phylogenetic relationships. Thus, this study proposes a sister relationship between C. porosus and C. siamensis and also suggests a close relationship of C. acutus to C. intermedius within the genus Crocodylus.
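An uncorrected pairwise distance (p-distance) is the proportion of aligned sites at which two sequences differ. The sketch below shows one common way to compute it; the handling of gaps and ambiguous bases (such sites are skipped) and the function name `p_distance` are assumptions, since the abstract does not state how the study treated them. The sequences are toy strings, not crocodilian data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Sites where either sequence has a gap or a non-ACGT character
    are excluded from the comparison (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# One difference over eight comparable sites.
d_full = p_distance("ACGTACGT", "ACGTACGA")
# The gapped site is skipped, leaving one difference over seven sites.
d_gapped = p_distance("AC-TACGT", "ACGTACGA")
print(d_full, d_gapped)
```

Because no substitution model is applied, the value underestimates the true divergence when multiple substitutions have hit the same site, which is why it is usually reported alongside model-based tree inference rather than in place of it.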
Mobile elements represent a unique and powerful set of tools for understanding the variation in a genome. Methods exist not only to utilize the polymorphisms among and within taxa to various ends but also to investigate the mechanism through which mobilization occurs. The number of methods to accomplish these ends is ever growing. Here, we present several protocols designed to assay mobile element-based variation within and among individual genomes.
Transposable elements (TE), defined as discrete pieces of DNA that can move from one site to another site in genomes, represent significant components of eukaryotic genomes, including primates. Comparative genome-wide analyses have revealed the considerable structural and functional impact of TE families on primate genomes. Insights into these questions have come in part from the development of computational methods that allow detailed and reliable identification, annotation, and evolutionary analyses of the many TE families that populate primate genomes. Here, we present an overview of these computational methods and describe efficient data mining strategies for providing a comprehensive picture of TE biology in newly available genome sequences.
Mobile elements (MEs) are diverse, common and dynamic inhabitants of nearly all genomes. ME transposition generates a steady stream of polymorphic genetic markers, deleterious and adaptive mutations, and substrates for further genomic rearrangements. Research on the impacts, population dynamics, and evolution of MEs is constrained by the difficulty of ascertaining rare polymorphic ME insertions that occur against a large background of pre-existing fixed elements and then genotyping them in many individuals.
L1s are one of the most successful autonomous mobile elements in primate genomes. These elements comprise as much as 17% of primate genomes with the majority of insertions occurring via target primed reverse transcription (TPRT). Twin priming, a variant of TPRT, can result in unusual DNA sequence architecture. These insertions appear to be inverted, truncated L1s flanked by target site duplications.
It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families - long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements - mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development.
The zebra finch is an important model organism in several fields with unique relevance to human neuroscience. Like other songbirds, the zebra finch communicates through learned vocalizations, an ability otherwise documented only in humans and a few other animals and lacking in the chicken, the only bird with a sequenced genome until now. Here we present a structural, functional and comparative analysis of the genome sequence of the zebra finch (Taeniopygia guttata), which is a songbird belonging to the large avian order Passeriformes. We find that the overall structures of the genomes are similar in zebra finch and chicken, but they differ in many intrachromosomal rearrangements, lineage-specific gene family expansions, the number of long-terminal-repeat-based retrotransposons, and mechanisms of sex chromosome dosage compensation. We show that song behaviour engages gene regulatory networks in the zebra finch brain, altering the expression of long non-coding RNAs, microRNAs, transcription factors and their targets. We also show evidence for rapid molecular evolution in the songbird lineage of genes that are regulated during song experience. These results indicate an active involvement of the genome in neural processes underlying vocal communication and identify potential genetic substrates for the evolution and regulation of this behaviour.
Allelic differences of chemokine (C-C motif) receptor 5 (CCR5) and CCR2, as well as the ligand for the chemokine receptor CXCR4, stromal-derived factor (SDF-1), are known to suppress HIV-1 transmission and to be involved in delay in HIV-1 disease progression. The aim of our study was to investigate the frequencies of four mutations that confer resistance to HIV-1: CCR5-Delta32, CCR5-m303, CCR2-64I, and SDF1-3A among Bahrainis. We have studied the DNA polymorphisms in 304 unrelated healthy Bahraini individuals without any known history of HIV-1 infection or AIDS symptoms. The CCR5-Delta32 mutation was detected by PCR analysis, while the CCR5-m303, CCR2-64I, and SDF1-3A mutations were detected by PCR-restriction fragment length polymorphism (PCR-RFLP) tests. Allele frequencies and the fit to the Hardy-Weinberg equilibrium were evaluated using the Arlequin population genetics application. The frequencies of the CCR5-Delta32, CCR2-64I, and SDF1-3A alleles were 2.8%, 8.9%, and 26.5%, respectively. No mutant alleles were detected for the CCR5-m303 mutation in 304 individuals. We estimated the risk of AIDS onset (relative hazard), computed from the three-locus genotype data. This is the first report of these four mutations conferring resistance to HIV-1 in the Bahraini population. The presence of the CCR5-Delta32 allele among Bahrainis may be attributed to the admixture with people of European descent. The CCR2-64I allele and especially the SDF1-3A allele are predominant in the Bahraini population and may be associated with resistance to fast HIV-1 infection in Bahrainis, and thus their genotyping can be used for prognosis in HIV-infected individuals.
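Allele frequencies and the fit to Hardy-Weinberg proportions can be illustrated with a short sketch. It uses a textbook chi-square goodness-of-fit statistic and is not a reproduction of the Arlequin analysis; the function name `hwe_chi_square` and the genotype counts are hypothetical, because the abstract reports allele frequencies only and not per-genotype counts.

```python
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Return (p, q, chi2) for a biallelic locus.

    p is the frequency of allele A, q = 1 - p, and chi2 compares the
    observed genotype counts with Hardy-Weinberg expectations
    (p^2, 2pq, q^2 times the sample size).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # each individual carries two alleles
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, q, chi2


# Hypothetical counts chosen to sit exactly at Hardy-Weinberg proportions.
p, q, chi2 = hwe_chi_square(25, 50, 25)
print(p, q, chi2)
```

With one degree of freedom, a chi-square value above about 3.84 would indicate departure from equilibrium at the 5% level. For rare alleles such as one with a frequency near 3%, expected homozygote counts become very small and an exact test is generally the safer choice.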
Their ability to move within genomes gives transposable elements an intrinsic propensity to affect genome evolution. Non-long terminal repeat (LTR) retrotransposons--including LINE-1, Alu and SVA elements--have proliferated over the past 80 million years of primate evolution and now account for approximately one-third of the human genome. In this Review, we focus on this major class of elements and discuss the many ways that they affect the human genome: from generating insertion mutations and genomic instability to altering gene expression and contributing to genetic innovation. Increasingly detailed analyses of human and other primate genomes are revealing the scale and complexity of the past and current contributions of non-LTR retrotransposons to genomic change in the human lineage.
Transposable elements (TEs) are an important source of genome diversity and play a crucial role in genome evolution. A recent study by Zhao et al. describes novel patterns of TE diversification in the genome of the extinct mammoth Mammuthus primigenius. Analysis of Mammuthus has revealed a unique genome landscape, establishes the mammoth as a pivotal species for understanding TEs and genome evolution, and hints at the diversity we are on the verge of discovering by expanding our taxonomic sampling among genomes. Strategies based on this work might also revolutionize investigations of the interface between TE dynamics and genome diversity.
Recombination rates vary widely across the human genome, but little of that variation is correlated with known DNA sequence features. The genome contains more than one million Alu mobile element insertions, and these insertions have been implicated in non-homologous recombination, modulation of DNA methylation, and transcriptional regulation. If individual Alu insertions have even modest effects on local recombination rates, they could collectively have a significant impact on the pattern of linkage disequilibrium in the human genome and on the evolution of the Alu family itself.
SVA elements represent the youngest family of hominid non-LTR retrotransposons, which alter the human genome continuously. They stand out due to their organization as composite repetitive elements. To draw conclusions on the assembly process that led to the current organization of SVA elements and on their transcriptional regulation, we initiated our study by assessing differences in structures of the 116 SVA elements located on human chromosome 19. We classified SVA elements into seven structural variants, including novel variants like 3′-truncated elements and elements with 5′-flanking sequence transductions. We established a genome-wide inventory of 5′-transduced SVA elements encompassing approximately 8% of all human SVA elements. The diversity of 5′ transduction events found indicates transcriptional control of their SVA source elements by a multitude of external cellular promoters in germ cells in the course of their evolution and suggests that SVA elements might be capable of acquiring 5′ promoter sequences. Our data indicate that SVA-mediated 5′ transduction events involve alternative RNA splicing at cryptic splice sites. We analyzed one remarkably successful human-specific SVA 5′ transduction group in detail because it includes at least 32% of all SVA subfamily F members. An ancient retrotransposition event brought an SVA insertion under transcriptional control of the MAST2 gene promoter, giving rise to the primal source element of this group. Members of this group are currently transcribed. Here we show that SVA-mediated 5′ transduction events lead to structural diversity of SVA elements and represent a novel source of genomic rearrangements contributing to genomic diversity.
The Alu family is a highly successful group of non-LTR retrotransposons ubiquitously found in primate genomes. Similar to the L1 retrotransposon family, Alu elements integrate primarily through an endonuclease-dependent mechanism termed target site-primed reverse transcription (TPRT). Recent studies have suggested that, in addition to TPRT, L1 elements occasionally utilize an alternative endonuclease-independent pathway for genomic integration. To determine whether an analogous mechanism exists for Alu elements, we have analyzed three publicly available primate genomes (human, chimpanzee and rhesus macaque) for recently integrated or lineage-specific Alu insertions that arose through an endonuclease-independent pathway. We recovered twenty-three examples of such insertions and show that these insertions are recognizably different from classical TPRT-mediated Alu element integration. We suggest a role for this process in DNA double-strand break repair and present evidence to suggest its association with intra-chromosomal translocations, in vitro RNA recombination (IVRR), and synthesis-dependent strand annealing (SDSA).
Structural variants (SVs) are common in the human genome. Because approximately half of the human genome consists of repetitive, transposable DNA sequences, it is plausible that these elements play an important role in generating SVs in humans. Sequencing of the diploid genome of one individual human (HuRef) affords us the opportunity to assess, for the first time, the impact of mobile elements on SVs in an individual in a thorough and unbiased fashion. In this study, we systematically evaluated more than 8000 SVs to identify mobile element-associated SVs as small as 100 bp and specific to the HuRef genome. Combining computational and experimental analyses, we identified and validated 706 mobile element insertion events (including Alu, L1, SVA elements, and nonclassical insertions), which added more than 305 kb of new DNA sequence to the HuRef genome compared with the Human Genome Project (HGP) reference sequence (hg18). We also identified 140 mobile element-associated deletions, which removed approximately 126 kb of sequence from the HuRef genome. Overall, approximately 10% of the HuRef-specific indels larger than 100 bp are caused by mobile element-associated events. More than one-third of the insertion/deletion events occurred in genic regions, and new Alu insertions occurred in exons of three human genes. Based on the number of insertions and the estimated time to the most recent common ancestor of HuRef and the HGP reference genome, we estimated the Alu, L1, and SVA retrotransposition rates to be one in 21 births, 212 births, and 916 births, respectively. This study presents the first comprehensive analysis of mobile element-related structural variants in the complete DNA sequence of an individual and demonstrates that mobile elements play an important role in generating inter-individual structural variation.
Genus Macaca (Cercopithecidae: Papionini) is one of the most successful primate radiations. Despite previous studies on morphology and mitochondrial DNA analysis, a number of issues regarding the details of macaque evolution remain unsolved. Alu elements are a class of non-autonomous retroposons belonging to short interspersed elements that are specific to the primate lineage. Because retroposon insertions show very little homoplasy, and because the ancestral state (absence of the SINE) is known, Alu elements are useful genetic markers and have been utilized for analyzing primate phylogenetic relationships and human population genetic relationships. Using PCR display methodology, 298 new Alu insertions have been identified from ten species of macaques. Together with 60 loci reported previously, a total of 358 loci are used to infer the phylogenetic relationships of genus Macaca. With regard to earlier unresolved issues on macaque evolution, the topology of our tree suggests that: 1) genus Macaca contains four monophyletic species groups; 2) within the Asian macaques, the silenus group diverged first, and members of the sinica and fascicularis groups share a common ancestor; 3) Macaca arctoides is classified in the sinica group. Our results provide a robust molecular phylogeny for genus Macaca with stronger statistical support than previous studies. The present study also illustrates that SINE-based approaches are a powerful tool in primate phylogenetic studies and can be used to successfully resolve evolutionary relationships between taxa at scales from the ordinal level to closely related species within one genus.
Retrotransposons, specifically Alu and L1 elements, have been especially successful in their expansion throughout primate genomes. While most of these elements integrate through an endonuclease-mediated process termed target-primed reverse transcription (TPRT), a minority integrate using alternative methods. Here we present evidence for one such mechanism, which we term internal priming, and demonstrate that loci integrating through this mechanism are qualitatively different from "classical" insertions. Previous examples of this mechanism are limited to cell culture assays, which show that reverse transcription can initiate upstream of the 3′ poly-A tail during retrotransposon integration. To detect whether this mechanism occurs in vivo as well as in cell culture, we have analyzed the human genome for internal priming events using recently integrated L1 and Alu elements. Our examination of the human genome resulted in the recovery of twenty events involving internal priming insertions, which are structurally distinct from both classical TPRT-mediated insertions and non-classical insertions. We suggest two possible mechanisms by which these internal priming loci are created and provide evidence supporting a role in staggered DNA double-strand break repair. Also, we demonstrate that the internal priming process is associated with inter-chromosomal duplications and the insertion of filler DNA.
DNA double-strand breaks (DSBs) are a common form of cellular damage that can lead to cell death if not repaired promptly. Experimental systems have shown that DSB repair in eukaryotic cells is often imperfect and may result in the insertion of extra chromosomal DNA or the duplication of existing DNA at the breakpoint. These events are thought to be a source of genomic instability and human diseases, but it is unclear whether they have contributed significantly to genome evolution. Here we developed an innovative computational pipeline that takes advantage of the repetitive structure of genomes to detect repair-mediated duplication events (RDs) that occurred in the germline and created insertions of at least 50 bp of genomic DNA. Using this pipeline we identified over 1,000 probable RDs in the human genome. Of these, 824 were intra-chromosomal, closely linked duplications of up to 619 bp bearing the hallmarks of the synthesis-dependent strand-annealing repair pathway. This mechanism has duplicated hundreds of sequences predicted to be functional in the human genome, including exons, UTRs, intron splice sites and transcription factor binding sites. Dating of the duplication events using comparative genomics and experimental validation revealed that the mechanism has operated continuously but with decreasing intensity throughout primate evolution. The mechanism has produced species-specific duplications in all primate species surveyed and is contributing to genomic variation among humans. Finally, we show that RDs have also occurred, albeit at a lower frequency, in non-primate mammals and other vertebrates, indicating that this mechanism has been an important force shaping vertebrate genome evolution.
The angiotensin-converting enzyme (ACE) gene in humans has an insertion-deletion (I/D) polymorphic state in intron 16 on chromosome 17q23. This polymorphism has been widely investigated in different populations due to its association with the renin-angiotensin system. However, similar studies for Arab populations are limited. This study addresses the distribution of the ACE gene polymorphism in three Arab populations (Egyptians, Jordanians and Syrians).
The human platelet alloantigen system HPA-1 in the Egyptian population was examined by polymerase chain reaction using sequence-specific primers (PCR-SSP). The objectives of this study were to evaluate the allele frequencies of HPA-1a and HPA-1b in healthy Egyptian individuals and compare these with the international literature. Human platelet antigen (HPA) systems are associated with alloimmunization and organ transplantation rejection as well as the development of cardiovascular disease. Of the various HPA systems, HPA-1 specifically has been considered to be the most important antigenic system implicated in the Caucasian population. However, no study has yet examined this system in the Egyptian population. We therefore investigated the allele frequency of the HPA-1 system in the Egyptian population.
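For readers unfamiliar with how allele frequencies are obtained from genotyping results such as PCR-SSP calls, the arithmetic can be sketched as follows. This is a minimal illustration by direct gene counting, not the study's own analysis: the function name `allele_frequencies` and the genotype counts are hypothetical and are not data from the abstract.

```python
def allele_frequencies(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Return (freq_a, freq_b) from counts of a/a, a/b and b/b genotypes.

    Each individual carries two alleles, so homozygotes contribute two
    copies of their allele and heterozygotes contribute one copy of each.
    """
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1.0 - freq_a


# Hypothetical genotype counts for illustration only (not from the study).
freq_a, freq_b = allele_frequencies(n_aa=70, n_ab=25, n_bb=5)
print(f"a allele: {freq_a:.3f}, b allele: {freq_b:.3f}")
```

The same counting applies to any biallelic marker, for example the insertion and deletion alleles of an I/D polymorphism.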
The third international conference on the genomic impact of eukaryotic transposable elements (TEs) was held 24 to 28 February 2012 at the Asilomar Conference Center, Pacific Grove, CA, USA. Sponsored in part by the National Institutes of Health grant 5 P41 LM006252, the goal of the conference was to bring together researchers from around the world who study the impact and mechanisms of TEs using multiple computational and experimental approaches. The meeting drew close to 170 attendees and included invited floor presentations on the biology of TEs and their genomic impact, as well as numerous talks contributed by young scientists. The workshop talks were devoted to computational analysis of TEs with additional time for discussion of unresolved issues. Also, there was ample opportunity for poster presentations and informal evening discussions. The success of the meeting reflects the important role of Repbase in comparative genomic studies, and emphasizes the need for close interactions between experimental and computational biologists in the years to come.
Lemurs (infraorder Lemuriformes) are a radiation of strepsirrhine primates endemic to the island of Madagascar. As of 2012, 101 lemur species, divided among five families, have been described. Genetic and morphological evidence indicates all species are descended from a common ancestor that arrived in Madagascar ~55-60 million years ago (mya). Phylogenetic relationships in this species-rich infraorder have been the subject of debate. Here we use Alu elements, a family of primate-specific Short INterspersed Elements (SINEs), to construct a phylogeny of infraorder Lemuriformes. Alu elements are particularly useful SINEs for the purpose of phylogeny reconstruction because they are identical by descent and confounding events between loci are easily resolved by sequencing. The genome of the grey mouse lemur (Microcebus murinus) was computationally assayed for synapomorphic Alu elements. Those that were identified as Lemuriformes-specific were analyzed against other available primate genomes for orthologous sequence in which to design primers for PCR (polymerase chain reaction) verification. A primate phylogenetic panel of 24 species, including 22 lemur species from all five families, was examined for the presence/absence of 138 Alu elements via PCR to establish relationships among species. Of these, 111 were phylogenetically informative. A phylogenetic tree was generated based on the results of this analysis. We demonstrate strong support for the monophyly of Lemuriformes to the exclusion of other primates, with Daubentoniidae, the aye-aye, as the basal lineage within the infraorder. Our results also suggest Lepilemuridae as a sister lineage to Cheirogaleidae, and Indriidae as sister to Lemuridae. Among the Cheirogaleidae, we show strong support for Microcebus and Mirza as sister genera, with Cheirogaleus the sister lineage to both. Our results also support the monophyly of the Lemuridae.
Within Lemuridae we place Lemur and Hapalemur together to the exclusion of Eulemur and Varecia, with Varecia the sister lineage to the other three genera.
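The step of separating phylogenetically informative loci from uninformative ones in a PCR presence/absence panel can be sketched as below. This is a minimal illustration that assumes the standard parsimony criterion (each state observed in at least two taxa); the abstract does not state which criterion was applied, and the locus names and calls are hypothetical.

```python
from typing import Optional, Sequence


def is_informative(calls: Sequence[Optional[int]]) -> bool:
    """Return True if a locus is parsimony-informative.

    calls holds one entry per taxon: 1 = insertion present, 0 = absent,
    None = no PCR result. Assumed criterion: both states must each be
    observed in at least two taxa.
    """
    present = sum(1 for c in calls if c == 1)
    absent = sum(1 for c in calls if c == 0)
    return present >= 2 and absent >= 2


# Hypothetical presence/absence matrix (five taxa per locus), for illustration.
matrix = {
    "locus_1": [1, 1, 0, 0, 0],     # shared by two taxa: informative
    "locus_2": [1, 0, 0, 0, 0],     # present in a single taxon: uninformative
    "locus_3": [1, 1, 1, 1, 1],     # present in every taxon: uninformative
    "locus_4": [1, 1, None, 0, 0],  # one missing call, still informative
}

informative = [name for name, calls in matrix.items() if is_informative(calls)]
print(informative)
```

Loci that pass such a filter are the ones that can group taxa in a tree search; insertions found in a single taxon or in every taxon carry no grouping signal under this criterion.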
Gibbons (Hylobatidae) are small, arboreal apes indigenous to Southeast Asia that diverged from other apes ~15-18 Ma. Extant lineages radiated rapidly 6-10 Ma and are organized into four genera (Hylobates, Hoolock, Symphalangus, and Nomascus) consisting of 12-19 species. The use of short interspersed elements (SINEs) as phylogenetic markers has seen recent popularity due to several desirable characteristics: the ancestral state of a locus is known to be the absence of an element, rare potentially homoplasious events are relatively easy to resolve, and samples can be quickly and inexpensively genotyped. During radiation of primates, one particular family of SINEs, the Alu family, has proliferated in primate genomes. Nomascus leucogenys (northern white-cheeked gibbon) sequences were analyzed for repetitive content with RepeatMasker using a custom library. The sequences containing Alu elements identified as members of a gibbon-specific subfamily were then compared with orthologous positions in other primate genomes. A primate phylogenetic panel consisting of 18 primate species, including 13 gibbon species representing all four extant genera, was assayed for all loci, and a total of 125 gibbon-specific Alu insertions were identified. The resulting amplification patterns were used to generate a phylogenetic tree. We demonstrate significant support for Symphalangus as the most basal lineage within the family. Our findings also place Nomascus as a derived lineage, sister to Hoolock, with the Nomascus-Hoolock clade sister to Hylobates. Further, our analysis groups N. leucogenys and Nomascus siki as sister taxa to the exclusion of the other Nomascus species assayed. This study represents the first use of SINEs to determine the genus level phylogenetic relationships within the family Hylobatidae. These relationships have been resolved with robust support at most internal nodes, demonstrating the utility of SINE-based phylogenetic analysis.
We postulate that hybridization and rapid radiation may have contributed to the complex and contradictory findings of the previous studies. Our findings will aid in the conservation of these threatened primates and inform future studies of the biogeographical history and distribution of modern gibbon species.
Gibbons (Hylobatidae) shared a common ancestor with the other hominoids only 15-18 million years ago. Nevertheless, gibbons show very distinctive features that include heavily rearranged chromosomes. Previous observations indicate that this phenomenon may be linked to the attenuated epigenetic repression of transposable elements (TEs) in gibbon species. Here we describe the massive expansion of a repeat in almost all the centromeres of the eastern hoolock gibbon (Hoolock leuconedys). We discovered that this repeat is a new composite TE originating from the combination of portions of three other elements (L1ME5, AluSz6, and SVA_A) and thus named it LAVA. We determined that this repeat is found in all the gibbons but does not occur in other hominoids. Detailed investigation of 46 different LAVA elements revealed that the majority of them have target site duplications (TSDs) and a poly-A tail, suggesting that they have been retrotransposing in the gibbon genome. Although we did not find a direct correlation between the emergence of LAVA elements and human-gibbon synteny breakpoints, this new composite transposable element is another mark of the great plasticity of the gibbon genome. Moreover, the centromeric expansion of LAVA insertions in the hoolock closely resembles the massive centromeric expansion of the KERV-1 retroelement reported for wallaby (marsupial) interspecific hybrids. The similarity between the two phenomena is consistent with the hypothesis that evolution of the gibbons is characterized by defects in epigenetic repression of TEs, perhaps triggered by interspecific hybridization.
Sequence analysis of the orangutan genome revealed that recent proliferative activity of Alu elements has been uncharacteristically quiescent in the Pongo (orangutan) lineage, compared with all previously studied primate genomes. With relatively few young polymorphic insertions, the genomic landscape of the orangutan seemed like the ideal place to search for a driver, or source element, of Alu retrotransposition.