The Paramecium aurelia complex is a group of 15 species that share at least three past whole-genome duplications (WGDs). The macronuclear genome sequences of P. biaurelia and P. sexaurelia are presented and compared to the published sequence of P. tetraurelia. Levels of duplicate-gene retention from the recent WGD differ by > 10% across species, with P. sexaurelia losing significantly more genes than P. biaurelia or P. tetraurelia. In addition, historically high rates of gene conversion have homogenized WGD paralogs, probably extending the paralogs' lifetimes. The probability of duplicate retention is positively correlated with GC content and expression level; ribosomal proteins, transcription factors, and intracellular signaling proteins are overrepresented among maintained duplicates. Finally, multiple sources of evidence indicate that P. sexaurelia diverged from the two other lineages immediately following, or perhaps concurrent with, the recent WGD, with approximately half of gene losses between P. tetraurelia and P. sexaurelia representing divergent gene resolutions (i.e., silencing of alternative paralogs), as expected for random duplicate loss between these species. Additionally, though P. biaurelia and P. tetraurelia diverged from each other much later, there are still more than 100 cases of divergent resolution between these two species. Taken together, these results indicate that divergent resolution of duplicate genes between lineages acts to reinforce reproductive isolation between species in the Paramecium aurelia complex.
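The "approximately half" expectation for divergent resolution under random duplicate loss can be checked with a small enumeration. This is a minimal sketch, not the authors' analysis: it assumes that each of two species silences exactly one of two paralogs (labelled "a" and "b" here for illustration), that either copy is equally likely to be lost, and that the two lineages lose copies independently. The function name `divergent_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def divergent_fraction():
    """Exact fraction of divergent resolutions under random, independent loss.

    Assumptions (illustrative, not from the abstract): each species loses
    exactly one of the two paralogs "a" or "b", each with probability 1/2,
    independently of the other species.
    """
    # Every (copy lost in species 1, copy lost in species 2) combination
    # is equally likely under these assumptions.
    outcomes = list(product("ab", repeat=2))
    # A resolution is divergent when the two species lost different copies.
    divergent = [o for o in outcomes if o[0] != o[1]]
    return Fraction(len(divergent), len(outcomes))


print(divergent_fraction())  # 1/2
```

Under these assumptions, two of the four equally likely outcomes are divergent, which matches the one-half expectation quoted for losses that occurred after the lineages split.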
Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species implicate this process as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event.
In the ciliate Paramecium, transposable elements and their single-copy remnants are deleted during the development of somatic macronuclei from germline micronuclei, at each sexual generation. Deletions are targeted by scnRNAs, small RNAs produced from the germ line during meiosis that first scan the maternal macronuclear genome to identify missing sequences, and then allow the zygotic macronucleus to reproduce the same deletions. Here we show that this process accounts for the maternal inheritance of mating types in Paramecium tetraurelia, a long-standing problem in epigenetics. Mating type E depends on expression of the transmembrane protein mtA, and the default type O is determined during development by scnRNA-dependent excision of the mtA promoter. In the sibling species Paramecium septaurelia, mating type O is determined by coding-sequence deletions in a different gene, mtB, which is specifically required for mtA expression. These independently evolved mechanisms suggest frequent exaptation of the scnRNA pathway to regulate cellular genes and mediate transgenerational epigenetic inheritance of essential phenotypic polymorphisms.
Accurate transmission and expression of genetic information are crucial for the survival of all living organisms. Recently, the coupling of mutation accumulation experiments and next-generation sequencing has greatly expanded our knowledge of the genomic mutation rate in both prokaryotes and eukaryotes. However, because of their transient nature, transcription errors have proven extremely difficult to quantify, and current estimates of transcription fidelity are derived from artificial constructs applied to just a few organisms. Here we report a unique cDNA library preparation technique that allows error detection in natural transcripts at the transcriptome-wide level. Application of this method to the model organism Caenorhabditis elegans revealed a base misincorporation rate in mRNAs of ~4 × 10⁻⁶ per site, with a very biased molecular spectrum. Because the proposed method is readily applicable to other organisms, this innovation provides unique opportunities for studying the incidence of transcription errors across the tree of life.
RNA editing is an important cellular process by which the nucleotides in a mature RNA transcript are altered to cause them to differ from the corresponding DNA sequence. While this process yields essential transcripts in humans and other organisms, it is believed to occur at a relatively small number of loci. The rarity of RNA editing has been challenged by a recent comparison of human RNA and DNA sequence data from 27 individuals, which revealed that over 10,000 human exonic sites appear to exhibit RNA-DNA differences (RDDs). Many of these differences could not have been caused by either of the two previously known human RNA editing mechanisms (ADAR-mediated A→G substitutions or APOBEC1-mediated C→U switches), suggesting that a previously unknown mechanism of RNA editing may be active in humans. Here, we reanalyze these data and demonstrate that genomic sequences exist in these same individuals or in the human genome that match the majority of RDDs. Our results suggest that the majority of these RDD events were observed due to accurate transcription of sequences paralogous to the apparently edited gene but differing at the edited site. In light of our results it seems prudent to conclude that if indeed an unknown mechanism is causing RDD events in humans, such events occur at a much lower frequency than originally proposed.
Recent observations on rates of mutation, recombination, and random genetic drift highlight the dramatic ways in which fundamental evolutionary processes vary across the divide between unicellular microbes and multicellular eukaryotes. Moreover, population-genetic theory suggests that the range of variation in these parameters is sufficient to explain the evolutionary diversification of many aspects of genome size and gene structure found among phylogenetic lineages. Most notably, large eukaryotic organisms that experience elevated magnitudes of random genetic drift are susceptible to the passive accumulation of mutationally hazardous DNA that would otherwise be eliminated by efficient selection. Substantial evidence also suggests that variation in the population-genetic environment influences patterns of protein evolution, with the emergence of certain kinds of amino-acid substitutions and protein-protein complexes only being possible in populations with relatively small effective sizes. These observations imply that the ultimate origins of many of the major genomic and proteomic disparities between prokaryotes and eukaryotes and among eukaryotic lineages have been molded as much by intrinsic variation in the genetic and cellular features of species as by external ecological forces.
Like all ciliates, Paramecium tetraurelia is a unicellular eukaryote that harbors two kinds of nuclei within its cytoplasm. At each sexual cycle, a new somatic macronucleus (MAC) develops from the germ line micronucleus (MIC) through a sequence of complex events, which includes meiosis, karyogamy, and assembly of the MAC genome from MIC sequences. The latter process involves developmentally programmed genome rearrangements controlled by noncoding RNAs and a specialized RNA interference machinery. We describe our first attempts to identify genes and biological processes that contribute to the progression of the sexual cycle. Given the high percentage of unknown genes annotated in the P. tetraurelia genome, we applied a global strategy to monitor gene expression profiles during autogamy, a self-fertilization process. We focused this pilot study on the genes carried by the largest somatic chromosome and designed dedicated DNA arrays covering 484 genes from this chromosome (1.2% of all genes annotated in the genome). Transcriptome analysis revealed four major patterns of gene expression, including two successive waves of gene induction. Functional analysis of 15 upregulated genes revealed four that are essential for vegetative growth, one of which is involved in the maintenance of MAC integrity and another in cell division or membrane trafficking. Two additional genes, encoding a MIC-specific protein and a putative RNA helicase localizing to the old and then to the new MAC, are specifically required during sexual processes. Our work provides a proof of principle that genes essential for meiosis and nuclear reorganization can be uncovered following genome-wide transcriptome analysis.
Proteins of the Argonaute family are small RNA carriers that guide regulatory complexes to their targets. The family comprises two major subclades. Members of the Ago subclade, which are present in most eukaryotic phyla, bind different classes of small RNAs and regulate gene expression at both transcriptional and post-transcriptional levels. Piwi subclade members appear to have been lost in plants and fungi and were mostly studied in metazoa, where they bind piRNAs and have essential roles in sexual reproduction. Their presence in ciliates, unicellular organisms harbouring both germline micronuclei and somatic macronuclei, offers an interesting perspective on the evolution of their functions. Here, we report phylogenetic and functional analyses of the 15 Piwi genes from Paramecium tetraurelia. We show that four constitutively expressed proteins are involved in siRNA pathways that mediate gene silencing throughout the life cycle. Two other proteins, specifically expressed during meiosis, are required for accumulation of scnRNAs during sexual reproduction and for programmed genome rearrangements during development of the somatic macronucleus. Our results indicate that Paramecium Piwi proteins have evolved to perform both vegetative and sexual functions through mechanisms ranging from post-transcriptional mRNA cleavage to epigenetic regulation of genome rearrangements.
The genome of Paramecium tetraurelia, a unicellular model that belongs to the ciliate phylum, has been shaped by at least 3 successive whole-genome duplications (WGDs). These dramatic events, which have also been documented in plants, animals and fungi, are resolved over evolutionary time by the loss of one duplicate for the majority of genes. Thanks to a low rate of large-scale genome rearrangement in Paramecium, an unprecedentedly large number of gene duplicates of different ages have been identified, making this organism an outstanding model to investigate the evolutionary consequences of polyploidization. The most recent WGD, with 51% of pre-duplication genes still in 2 copies, provides a snapshot of a phase of rapid gene loss that is not accessible in more ancient polyploids such as yeast.
The understanding of selective constraints affecting genes is a major issue in biology. It is well established that gene expression level is a major determinant of the rate of protein evolution, but the reasons for this relationship remain highly debated. Here we demonstrate that gene expression is also a major determinant of the evolution of gene dosage: the rate of gene losses after whole genome duplications in the Paramecium lineage is negatively correlated to the level of gene expression, and this relationship is not a byproduct of other factors known to affect the fate of gene duplicates. This indicates that changes in gene dosage are generally more deleterious for highly expressed genes. This rule also holds for other taxa: in yeast, we find a clear relationship between gene expression level and the fitness impact of reduction in gene dosage. To explain these observations, we propose a model based on the fact that the optimal expression level of a gene corresponds to a trade-off between the benefit and cost of its expression. This COSTEX model predicts that selective pressure against mutations changing gene expression level or affecting the encoded protein should on average be stronger in highly expressed genes and hence that both the frequency of gene loss and the rate of protein evolution should correlate negatively with gene expression. Thus, the COSTEX model provides a simple and common explanation for the general relationship observed between the level of gene expression and the different facets of gene evolution.
Distinct small RNA pathways are involved in the two types of homology-dependent effects described in Paramecium tetraurelia, as shown by a functional analysis of Dicer and Dicer-like genes and by the sequencing of small RNAs. The siRNAs that mediate post-transcriptional gene silencing when cells are fed with double-stranded RNA (dsRNA) were found to comprise two subclasses. DCR1-dependent cleavage of the inducing dsRNA generates approximately 23-nt primary siRNAs from both strands, while a different subclass of approximately 24-nt RNAs, characterized by a short untemplated poly-A tail, is strictly antisense to the targeted mRNA, suggestive of secondary siRNAs that depend on an RNA-dependent RNA polymerase. An entirely distinct pathway is responsible for homology-dependent regulation of developmental genome rearrangements after sexual reproduction. During early meiosis, the DCL2 and DCL3 genes are required for the production of a highly complex population of approximately 25-nt scnRNAs from all types of germline sequences, including both strands of exons, introns, intergenic regions, transposons and Internal Eliminated Sequences. A prominent 5'-UNG signature, and a minor fraction showing the complementary signature at positions 21-23, indicate that scnRNAs are cleaved from dsRNA precursors as duplexes with 2-nt 3' overhangs at both ends, followed by preferential stabilization of the 5'-UNG strand.
Classical studies in Metabolic Control Theory have shown that metabolic fluxes usually exhibit little sensitivity to changes in individual enzyme activity, yet remain sensitive to global changes of all enzymes in a pathway. Therefore, little selective pressure is expected on the dosage or expression of individual metabolic genes, yet entire pathways should still be constrained. However, a direct estimate of this selective pressure had not been evaluated. Whole-genome duplications (WGDs) offer a good opportunity to address this question by analyzing the fates of metabolic genes during the massive gene losses that follow. Here, we take advantage of the successive rounds of WGD that occurred in the Paramecium lineage. We show that metabolic genes exhibit different gene retention patterns than nonmetabolic genes. Contrary to what was expected for individual genes, metabolic genes appeared more retained than other genes after the recent WGD, which was best explained by selection for gene expression operating on entire pathways. Metabolic genes also tend to be less retained when present at high copy number before WGD, contrary to other genes that show a positive correlation between gene retention and preduplication copy number. This is rationalized on the basis of the classical concave relationship relating metabolic fluxes with enzyme expression.
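The contrast between single-enzyme and whole-pathway changes can be illustrated numerically. The sketch below is not the model used in the study: it assumes a textbook simplification in which the steady-state flux through a linear pathway of unsaturated enzymes is proportional to 1 / Σ(1/Eᵢ), with all kinetic and equilibrium constants set to 1. The function name `pathway_flux` and the ten-enzyme pathway are hypothetical choices for illustration.

```python
def pathway_flux(enzymes):
    """Flux through a linear pathway under the simplified form J = 1 / sum(1/E_i).

    Assumption (illustrative): unsaturated enzymes, with all kinetic and
    equilibrium constants set to 1, so only enzyme amounts E_i matter.
    """
    return 1.0 / sum(1.0 / e for e in enzymes)


# Hypothetical ten-step pathway with every enzyme at a baseline amount of 1.
base = [1.0] * 10
one_doubled = [2.0] + [1.0] * 9   # dosage of a single enzyme doubled
all_doubled = [2.0] * 10          # dosage of every enzyme doubled

j0 = pathway_flux(base)
j1 = pathway_flux(one_doubled)
j2 = pathway_flux(all_doubled)

print(f"one enzyme doubled:  flux x{j1 / j0:.3f}")  # x1.053
print(f"all enzymes doubled: flux x{j2 / j0:.3f}")  # x2.000
```

In this toy pathway, doubling one enzyme raises flux by only about 5%, whereas doubling all ten doubles it. This is the concave, diminishing-returns behaviour for individual enzymes that the abstract invokes, alongside the sensitivity to coordinated changes across a pathway.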