Human populations have experienced dramatic growth since the Neolithic revolution. Recent studies that sequenced a very large number of individuals observed an extreme excess of rare variants and provided clear evidence of recent rapid growth in effective population size, although estimates have varied greatly among studies. All these studies were based on protein-coding genes, in which variants are also impacted by natural selection. In this study, we introduce targeted sequencing data for studying recent human history with minimal confounding by natural selection. We sequenced loci far from genes that meet a wide array of additional criteria such that mutations in these loci are putatively neutral. As population structure also skews allele frequencies, we sequenced 500 individuals of relatively homogeneous ancestry by first analyzing the population structure of 9,716 European Americans. We used very high coverage sequencing to reliably call rare variants and fit an extensive array of models of recent European demographic history to the site frequency spectrum. The best-fit model estimates ~3.4% growth per generation during the last ~140 generations, resulting in a population size increase of two orders of magnitude. This model fits the data very well, largely due to our observation that assumptions of more ancient demography can impact estimates of recent growth. This observation and our results also shed light on the discrepancy in demographic estimates among recent studies.
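The growth figures quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes simple discrete compounding per generation, treats the approximate values from the abstract (3.4% and 140 generations) as exact, and uses a hypothetical helper name, fold_increase, that does not come from the study.

```python
import math


def fold_increase(rate_per_generation: float, generations: int) -> float:
    """Fold change in population size under constant per-generation growth.

    Assumes discrete compounding: N_t = N_0 * (1 + r) ** t.
    """
    return (1.0 + rate_per_generation) ** generations


# Approximate values taken from the abstract, treated here as exact.
growth = fold_increase(0.034, 140)
orders_of_magnitude = math.log10(growth)

print(f"fold increase: {growth:.1f}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

Under these assumptions the fold increase comes out slightly above 100, which is consistent with the stated increase of two orders of magnitude.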
This study addresses the question of how purifying selection operates during recent rapid population growth such as has been experienced by human populations. This is not a straightforward problem because the human population is not at equilibrium: population genetics predicts that, on the one hand, the efficacy of natural selection increases as population size increases, eliminating ever more weakly deleterious variants; on the other hand, a larger number of deleterious mutations will be introduced into the population and will be more likely to increase in their number of copies as the population grows. To understand how patterns of human genetic variation have been shaped by the interaction of natural selection and population growth, we examined the trajectories of mutations with varying selection coefficients, using computer simulations. We observed that while population growth dramatically increases the number of deleterious segregating sites in the population, it only mildly increases the number carried by each individual. Our simulations also show an increased efficacy of natural selection, reflected in a higher fraction of deleterious mutations eliminated at each generation and a more efficient elimination of the most deleterious ones. As a consequence, while each individual carries a larger number of deleterious alleles than expected in the absence of growth, the average selection coefficient of each segregating allele is less deleterious. Combined, our results suggest that the genetic risk of complex diseases in growing populations might be distributed across a larger number of more weakly deleterious rare variants.
Copy number variants (CNVs) are increasingly acknowledged as an important source of evolutionary novelties in the human lineage. However, our understanding of their significance is still hindered by the lack of primate CNV data. We performed intraspecific comparative genomic hybridizations to identify loci harboring copy number variants in each of the four great apes: bonobos, chimpanzees, gorillas, and orangutans. For the first time, we could analyze differences in CNV location and frequency in these four species, and compare them with human CNVs and primate segmental duplication (SD) maps. In addition, for bonobo and gorilla, patterns of CNV and nucleotide diversity were studied in the same individuals. We show that CNVs have been subject to different selective pressures in different lineages. Evidence for purifying selection is stronger in gorilla CNVs overlapping genes, while positive selection appears to have driven the fixation of structural variants in the orangutan lineage. In contrast, chimpanzees and bonobos present high levels of common structural polymorphism, which is indicative of relaxed purifying selection together with the higher mutation rates induced by the known burst of segmental duplication in the ancestor of the African apes. Indeed, the impact of the duplication burst is noticeable by the fact that bonobo and chimpanzee share more CNVs with gorilla than expected. Finally, we identified a number of interesting genomic regions that present high-frequency CNVs in all great apes, while containing only very rare or even pathogenic structural variants in humans.
Orang-utan is derived from a Malay term meaning "man of the forest" and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than in other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.
A growing body of evidence shows that genes are not distributed randomly across eukaryotic chromosomes, but rather in functional neighborhoods. Nevertheless, the driving force that originated and maintains such neighborhoods is still a matter of controversy. We present the first detailed multispecies cartography of genome regions enriched in genes with related functions and study the evolutionary implications of such clustering. Our results indicate that the chromosomes of higher eukaryotic genomes contain up to 12% of genes arranged in functional neighborhoods, with a high level of gene co-expression, which are consistently distributed in phylogenies. Unexpectedly, neighborhoods with homologous functions are formed by different (non-orthologous) genes in different species. In fact, instead of being conserved, functional neighborhoods present a higher degree of synteny breaks than the genome average. This scenario is compatible with the existence of selective pressures optimizing the coordinated transcription of blocks of functionally related genes. If these neighborhoods were broken by chromosomal rearrangements, selection would favor further rearrangements reconstructing other neighborhoods of similar function. The picture arising from this study is a dynamic genomic landscape with a high level of functional organization.
Allelic specific gene expression (ASGE) appears to be an important factor in human phenotypic variability and, as a consequence, in the development of complex traits and diseases. In order to study ASGE across the human genome, we have performed a study in which genotyping was coupled with an analysis of ASGE by screening 11,500 SNPs using the Mapping 10 K Array to identify differential allelic expression. We found that of the 5,133 SNPs that were suitable for analysis (heterozygous in our sample and expressed in peripheral blood mononuclear cells), 2,934 (57%) SNPs had differential allelic expression. Such SNPs were equally distributed along human chromosomes and biological processes. We validated the presence or absence of ASGE in 18 out of 20 SNPs (90%) randomly selected by real time PCR in 48 human subjects. In addition, we observed that SNPs close to (but not included in) segmental duplications had increased levels of ASGE. Finally, we found that transcripts of unknown function or non-coding RNAs also display ASGE: from a total of 2,308 intronic SNPs, 1,510 (65%) SNPs underwent differential allelic expression. In summary, ASGE is a widespread mechanism in the human genome whose regulation seems to be far more complex than expected.
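The proportions reported in this abstract can be reproduced from the stated counts. The following is a minimal sketch, not part of the original analysis; the helper name percent and the dictionary labels are hypothetical, and only the counts are taken from the text.

```python
def percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Counts as stated in the abstract: (SNPs with the property, SNPs examined).
reported_counts = {
    "differential allelic expression": (2934, 5133),
    "validated by real time PCR": (18, 20),
    "intronic SNPs with differential expression": (1510, 2308),
}

# Percentages rounded to whole numbers, matching the abstract's precision.
rounded = {
    label: round(percent(part, total))
    for label, (part, total) in reported_counts.items()
}

for label, value in rounded.items():
    print(f"{label}: {value}%")
```

The rounded values agree with the 57%, 90%, and 65% given in the abstract.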