Hematopoietic stem cells (HSCs) are used clinically in transplantation to rebuild a patient's hematopoietic system in many diseases, such as leukemia and lymphoma. Elucidating the mechanisms that control HSC self-renewal and differentiation is important for the application of HSCs in research and clinical use. However, it is not possible to obtain large quantities of HSCs because of their inability to proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow-derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarrays for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified as potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
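A minimal sketch of the last analysis step described above, picking transcription factors out of a differential expression table. It assumes that a differential expression tool has already produced a log2 fold change (Lin-CD34+ relative to Lin-CD34-) and an adjusted p-value for each gene. The gene names, the thresholds and the `rank_de_transcription_factors` helper are hypothetical and are not taken from the protocol.

```python
from typing import NamedTuple


class DEResult(NamedTuple):
    gene: str
    log2_fold_change: float  # Lin-CD34+ relative to Lin-CD34- (assumed convention)
    padj: float              # multiple-testing-adjusted p-value


def rank_de_transcription_factors(results, tf_genes, alpha=0.05, min_abs_log2fc=1.0):
    """Keep transcription factors passing both thresholds, most significant first.

    alpha and min_abs_log2fc are illustrative defaults, not values from the paper.
    """
    hits = [
        r for r in results
        if r.gene in tf_genes
        and r.padj < alpha
        and abs(r.log2_fold_change) >= min_abs_log2fc
    ]
    # Sort by adjusted p-value, breaking ties by larger absolute fold change.
    return sorted(hits, key=lambda r: (r.padj, -abs(r.log2_fold_change)))


# Toy input with made-up gene names and values.
results = [
    DEResult("TfA", 3.2, 1e-8),
    DEResult("TfB", -2.1, 1e-4),
    DEResult("TfC", 0.3, 0.6),      # a transcription factor, but not significant
    DEResult("GeneX", 4.0, 1e-10),  # significant, but not a transcription factor
]
tf_genes = {"TfA", "TfB", "TfC"}

ranked = rank_de_transcription_factors(results, tf_genes)
for hit in ranked:
    print(hit.gene, hit.log2_fold_change, hit.padj)
```

Sorting on the adjusted p-value first mirrors the idea of taking the most significantly differentially expressed factors forward as candidates for functional analysis.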
21 Related JoVE Articles
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g. drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism [1,2]. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence [3].
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recently published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin" [4], in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University, BioMérieux, Hospices Civils de Lyon, Hospices Civils de Lyon.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values [1]. 'How to use PSA' remains a current issue, either for diagnosis, as a gray zone corresponding to a serum concentration of 2.5-10 ng/ml does not allow a clear differentiation to be made between cancer and noncancer [2], or for patient follow-up, as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application [3,4]. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g. PCA3 in prostate cancer [5,6], and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the number of transcriptional regulatory motifs is at least 4.5 times higher than the number corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as this is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer [7-10]. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Mouse Genome Engineering Using Designer Nucleases
Institutions: University of Zurich, University of Minnesota.
Transgenic mice carrying site-specific genome modifications (knockout, knock-in) are of vital importance for dissecting complex biological systems as well as for modeling human diseases and testing therapeutic strategies. Recent advances in the use of designer nucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system for site-specific genome engineering open the possibility to perform rapid targeted genome modification in virtually any laboratory species without the need to rely on embryonic stem (ES) cell technology. A genome editing experiment typically starts with identification of designer nuclease target sites within a gene of interest followed by construction of custom DNA-binding domains to direct nuclease activity to the investigator-defined genomic locus. Designer nuclease plasmids are in vitro transcribed to generate mRNA for microinjection of fertilized mouse oocytes. Here, we provide a protocol for achieving targeted genome modification by direct injection of TALEN mRNA into fertilized mouse oocytes.
Genetics, Issue 86, Oocyte microinjection, Designer nucleases, ZFN, TALEN, Genome Engineering
Ablation of a Single Cell From Eight-cell Embryos of the Amphipod Crustacean Parhyale hawaiensis
Institutions: Harvard University.
The amphipod Parhyale hawaiensis is a small crustacean found in intertidal marine habitats worldwide. Over the past decade, Parhyale has emerged as a promising model organism for laboratory studies of development, providing a useful outgroup comparison to the well studied arthropod model organism Drosophila melanogaster. In contrast to the syncytial cleavages of Drosophila, the early cleavages of Parhyale are holoblastic. Fate mapping using tracer dyes injected into early blastomeres has shown that all three germ layers and the germ line are established by the eight-cell stage. At this stage, three blastomeres are fated to give rise to the ectoderm, three are fated to give rise to the mesoderm, and the remaining two blastomeres are the precursors of the endoderm and germ line respectively. However, blastomere ablation experiments have shown that Parhyale embryos also possess significant regulatory capabilities, such that the fates of blastomeres ablated at the eight-cell stage can be taken over by the descendants of some of the remaining blastomeres. Blastomere ablation has previously been described by one of two methods: injection and subsequent activation of phototoxic dyes or manual ablation. However, photoablation kills blastomeres but does not remove the dead cell body from the embryo. Complete physical removal of specific blastomeres may therefore be a preferred method of ablation for some applications. Here we present a protocol for manual removal of single blastomeres from the eight-cell stage of Parhyale embryos, illustrating the instruments and manual procedures necessary for complete removal of the cell body while keeping the remaining blastomeres alive and intact. This protocol can be applied to any Parhyale cell at the eight-cell stage, or to blastomeres of other early cleavage stages. In addition, in principle this protocol could be applicable to early cleavage stage embryos of other holoblastically cleaving marine invertebrates.
Developmental Biology, Issue 85, Amphipod, experimental embryology, micromere, germ line, ablation, developmental potential, vasa
Profiling of Estrogen-regulated MicroRNAs in Breast Cancer Cells
Institutions: University of Houston.
Estrogen plays vital roles in mammary gland development and breast cancer progression. It mediates its function by binding to and activating the estrogen receptors (ERs), ERα and ERβ. ERα is frequently upregulated in breast cancer and drives the proliferation of breast cancer cells. The ERs function as transcription factors and regulate gene expression. Whereas ERα's regulation of protein-coding genes is well established, its regulation of noncoding microRNA (miRNA) is less explored. miRNAs play a major role in the post-transcriptional regulation of genes, inhibiting their translation or degrading their mRNA. miRNAs can function as oncogenes or tumor suppressors and are also promising biomarkers. Among the miRNA assays available, microarray and quantitative real-time polymerase chain reaction (qPCR) have been extensively used to detect and quantify miRNA levels. To identify miRNAs regulated by estrogen signaling in breast cancer, their expression in ERα-positive breast cancer cell lines was compared before and after estrogen activation using both the µParaflo-microfluidic microarrays and Dual Labeled Probes-low density arrays. Results were validated using specific qPCR assays, applying both Cyanine dye-based and Dual Labeled Probes-based chemistry. Furthermore, a time-point assay was used to identify regulations over time. An advantage of the miRNA assay approach used in this study is that it enables fast screening of mature miRNA regulations in numerous samples, even with limited sample amounts. The layout, including the specific conditions for cell culture and estrogen treatment, biological and technical replicates, and large-scale screening followed by in-depth confirmations using separate techniques, ensures a robust detection of miRNA regulations, and eliminates false positives and other artifacts. However, mutated or unknown miRNAs, or regulations at the primary and precursor transcript level, will not be detected.
The method presented here represents a thorough investigation of estrogen-mediated miRNA regulation.
Medicine, Issue 84, breast cancer, microRNA, estrogen, estrogen receptor, microarray, qPCR
Analysis of RNA Processing Reactions Using Cell Free Systems: 3' End Cleavage of Pre-mRNA Substrates in vitro
Institutions: The Scripps Research Institute, City College of New York.
The 3’ end of mammalian mRNAs is not formed by abrupt termination of transcription by RNA polymerase II (RNPII). Instead, RNPII synthesizes precursor mRNA beyond the end of mature RNAs, and an active process of endonuclease activity is required at a specific site. Cleavage of the precursor RNA normally occurs 10-30 nt downstream from the consensus polyA site (AAUAAA) after the CA dinucleotides. Proteins from the cleavage complex, a multifactorial protein complex of approximately 800 kDa, accomplish this specific nuclease activity. Specific RNA sequences upstream and downstream of the polyA site control the recruitment of the cleavage complex. Immediately after cleavage, pre-mRNAs are polyadenylated by the polyA polymerase (PAP) to produce mature stable RNA messages.
Processing of the 3’ end of an RNA transcript may be studied using cellular nuclear extracts with specific radiolabeled RNA substrates. In sum, a long 32P-labeled uncleaved precursor RNA is incubated with nuclear extracts in vitro, and cleavage is assessed by gel electrophoresis and autoradiography. When proper cleavage occurs, a shorter 5’ cleaved product is detected and quantified. Here, we describe the cleavage assay in detail using, as an example, the 3’ end processing of HIV-1 mRNAs.
Infectious Diseases, Issue 87, Cleavage, Polyadenylation, mRNA processing, Nuclear extracts, 3' Processing Complex
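The positional rule stated in the abstract above (cleavage 10-30 nt downstream of the AAUAAA signal, after a CA dinucleotide) can be sketched as a simple sequence scan. This is only an illustration under stated assumptions: the distance is measured here from the 3' end of the hexamer, the function name and toy sequence are hypothetical, and real site choice also depends on the upstream and downstream elements and the cleavage complex that the abstract mentions.

```python
import re


def candidate_cleavage_sites(rna, signal="AAUAAA", min_dist=10, max_dist=30):
    """Return (signal_start, cut) pairs for CA dinucleotides in the window.

    cut is the length of the 5' cleavage product, i.e. cleavage happens
    immediately after the A of the CA dinucleotide. Distances are counted
    from the 3' end of the signal (an assumption made for this sketch).
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    # A lookahead finds every occurrence of the signal, including overlapping ones.
    for match in re.finditer(f"(?={re.escape(signal)})", rna):
        signal_end = match.start() + len(signal)
        for dist in range(min_dist, max_dist + 1):
            cut = signal_end + dist
            if cut <= len(rna) and rna[cut - 2:cut] == "CA":
                sites.append((match.start(), cut))
    return sites


# Toy substrate: signal, a 13 nt spacer, a CA dinucleotide, then a short tail.
substrate = "GG" + "AAUAAA" + "G" * 13 + "CA" + "UUUU"
sites = candidate_cleavage_sites(substrate)
for signal_start, cut in sites:
    print(signal_start, cut, substrate[:cut])
```

In the assay itself the shorter 5' product is what is detected on the gel, so `substrate[:cut]` corresponds to the band expected when cleavage occurs at that candidate site.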
Massively Parallel Reporter Assays in Cultured Mammalian Cells
Institutions: Broad Institute.
The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence of interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.
Genetics, Issue 90, gene regulation, transcriptional regulation, sequence-activity mapping, reporter assay, library cloning, transfection, tag sequencing, mammalian cells
RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord
Institutions: Universidade de São Paulo.
In ovo electroporation of the chick neural tube is a fast and inexpensive method for identification of gene function during neural development. Genome wide analysis of differentially expressed transcripts after such an experimental manipulation has the potential to uncover an almost complete picture of the downstream effects caused by the transfected construct. This work describes a simple method for comparing transcriptomes from samples of transfected embryonic spinal cords comprising all steps between electroporation and identification of differentially expressed transcripts. The first stage consists of guidelines for electroporation and instructions for dissection of transfected spinal cord halves from HH23 embryos in a ribonuclease-free environment and extraction of high-quality RNA samples suitable for transcriptome sequencing. The next stage is that of bioinformatic analysis with general guidelines for filtering and comparison of RNA-Seq datasets in the Galaxy public server, which eliminates the need for a local computational structure for small to medium scale experiments. The representative results show that the dissection methods generate high quality RNA samples and that the transcriptomes obtained from two control samples are essentially the same, an important requirement for detection of differentially expressed genes in experimental samples. Furthermore, one example is provided where experimental overexpression of a DNA construct can be visually verified after comparison with control samples. The application of this method may be a powerful tool to facilitate new discoveries on the function of neural factors involved in early spinal cord development.
Developmental Biology, Issue 93, chicken embryo, in ovo electroporation, spinal cord, RNA-Seq, transcriptome profiling, Galaxy workflow
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus. Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus Asian tiger mosquito, photoperiodic diapause, RNA-Seq de novo transcriptome assembly, mosquito husbandry
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Efficient and Rapid Isolation of Early-stage Embryos from Arabidopsis thaliana Seeds
Institutions: University of Zürich.
In flowering plants, the embryo develops within a nourishing tissue - the endosperm - surrounded by the maternal seed integuments (or seed coat). As a consequence, the isolation of plant embryos at early stages (1 cell to globular stage) is technically challenging due to their relative inaccessibility. Efficient manual dissection at early stages is strongly impaired by the small size of young Arabidopsis seeds and the adhesiveness of the embryo to the surrounding tissues. Here, we describe a method that allows the efficient isolation of young Arabidopsis embryos, yielding up to 40 embryos in 1 hr to 4 hr, depending on the downstream application. Embryos are released into isolation buffer by slightly crushing 250-750 seeds with a plastic pestle in an Eppendorf tube. A glass microcapillary attached to either a standard laboratory pipette (via a rubber tube) or a hydraulically controlled microinjector is used to collect embryos from droplets placed on a multi-well slide on an inverted light microscope. The technical skills required are simple and easily transferable, and the basic setup does not require costly equipment. Collected embryos are suitable for a variety of downstream applications such as RT-PCR, RNA sequencing, DNA methylation analyses, fluorescence in situ hybridization (FISH), immunostaining, and reporter gene assays.
Plant Biology, Issue 76, Cellular Biology, Developmental Biology, Molecular Biology, Genetics, Embryology, Embryo isolation, Arabidopsis thaliana, RNA amplification, transcriptomics, DNA methylation profiling, FISH, reporter assays
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
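As an illustration of how a nucleotide reactivity table can become pseudo-energy constraints, the sketch below applies the widely used log-linear relation ΔG_SHAPE(i) = m · ln(reactivity_i + 1) + b. The abstract above does not state the formula or the parameters; the slope m = 2.6 kcal/mol and intercept b = -0.8 kcal/mol are the commonly cited defaults and are an assumption here, as are the function name, the toy reactivities and the choice to represent missing data as `None`.

```python
import math


def shape_pseudo_energy(reactivity, slope=2.6, intercept=-0.8):
    """Pseudo-free-energy term (kcal/mol) for one nucleotide.

    reactivity: normalized SHAPE reactivity, or None when no data exist.
    slope, intercept: assumed default parameters, not values from the abstract.
    """
    if reactivity is None:
        return 0.0  # no data: add no pseudo-energy term (assumption of this sketch)
    if reactivity <= -1.0:
        raise ValueError("reactivity must be greater than -1 for the logarithm")
    return slope * math.log(reactivity + 1.0) + intercept


# Toy reactivity table (made-up values): position -> normalized reactivity.
reactivities = {1: 0.0, 2: 0.15, 3: 1.0, 4: None, 5: 2.4}
constraints = {pos: shape_pseudo_energy(r) for pos, r in reactivities.items()}
for pos, energy in constraints.items():
    print(pos, round(energy, 3))
```

Under this relation, unreactive nucleotides receive a small favorable term for pairing, while highly reactive (flexible, likely single-stranded) nucleotides receive a penalty that grows with the logarithm of the reactivity.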
Performing Custom MicroRNA Microarray Experiments
Institutions: University of Minnesota, University of Minnesota.
microRNAs (miRNAs) are a large family of ~22 nucleotides (nt) long RNA molecules that are widely expressed in eukaryotes [1]. Complex genomes encode at least hundreds of miRNAs, which primarily inhibit the expression of a vast number of target genes post-transcriptionally [2,3]. miRNAs control a broad range of biological processes [1]. In addition, altered miRNA expression has been associated with human diseases such as cancers, and miRNAs may serve as biomarkers for diseases and prognosis [4,5]. It is important, therefore, to understand the expression and functions of miRNAs under many different conditions.
Three major approaches have been employed to profile miRNA expression: real-time PCR, microarray, and deep sequencing. The miRNA microarray technique has the advantages of being high-throughput and generally less expensive, and most of the experimental and analysis steps can be carried out in a molecular biology laboratory at most universities, medical schools and associated hospitals. Here, we describe a method for performing custom miRNA microarray experiments. A miRNA probe set is printed on glass slides to produce miRNA microarrays. RNA is isolated using a method or reagent that preserves small RNA species, and then labeled with a fluorescent dye. As a control, reference DNA oligonucleotides corresponding to a subset of miRNAs are also labeled with a different fluorescent dye. The reference DNA serves to demonstrate the quality of the slide and hybridization and is also used for data normalization. The RNA and DNA are mixed and hybridized to a microarray slide containing probes for most of the miRNAs in the database. After washing, the slide is scanned to obtain images, and the intensities of the individual spots are quantified. These raw signals are further processed and analyzed as the expression data of the corresponding miRNAs. Microarray slides can be stripped and regenerated to reduce the cost of microarrays and to enhance the consistency of microarray experiments. The same principles and procedures are applicable to other types of custom microarray experiments.
Molecular Biology, Issue 56, Genetics, microRNA, custom microarray, oligonucleotide probes, RNA labeling
Genome-wide Screen for miRNA Targets Using the MISSION Target ID Library
The Target ID Library is designed to assist in discovery and identification of microRNA (miRNA) targets. The Target ID Library is a plasmid-based, genome-wide cDNA library cloned into the 3'UTR downstream from the dual-selection fusion protein, thymidine kinase-zeocin (TKzeo). The first round of selection is for stable transformants, followed with introduction of a miRNA of interest, and finally, selecting for cDNAs containing the miRNA's target. Selected cDNAs are identified by sequencing (see Figure 1-3 for Target ID Library Workflow and details).
To ensure broad coverage of the human transcriptome, Target ID Library cDNAs were generated via oligo-dT priming using a pool of total RNA prepared from multiple human tissues and cell lines. The resulting cDNAs range from 0.5 to 4 kb, with an average size of 1.2 kb, and were cloned into the p3'TKzeo dual-selection plasmid (see Figure 4 for plasmid map). The gene targets represented in the library can be found on the Sigma-Aldrich webpage. Results from Illumina sequencing (Table 3) show that the library includes 16,922 of the 21,518 unique genes in UCSC RefGene (79%), or 14,000 genes with 10 or more reads (66%).
Genetics, Issue 62, Target ID, miRNA, ncRNA, RNAi, genomics
Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA
Institutions: Morgridge Institute for Research, University of Wisconsin, University of California.
Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity and a large dynamic range, and allows measurement of transcript copy numbers in a sample. Illumina’s genome analyzer performs sequencing of a large number (> 10^7) of relatively short sequence reads (< 150 bp). The "paired end" approach, wherein a single long fragment is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo
One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples has been described [1,2] but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published [3,4] but require microgram amounts of starting total RNA.
Here we describe a library preparation protocol for mRNA-Seq on the Illumina Genome Analyzer II platform that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing [5], where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome [6] and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples [7].
Molecular Biology, Issue 56, Genetics, mRNA-Seq, Illumina-Seq, gene expression profiling, high throughput sequencing
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although neither of these is effortless. As a corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in limited tissues, plant species or stress treatments, which is currently an important step toward understanding plant metabolism.
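The co-expression step can be sketched as follows. This is only an illustration, assuming Pearson correlation as the similarity measure and a fixed threshold; the abstract does not specify either, and the gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_candidates(expression, bait, threshold=0.9):
    """Rank genes whose profile correlates with a bait gene of known function."""
    bait_profile = expression[bait]
    scored = [(gene, pearson(bait_profile, profile))
              for gene, profile in expression.items() if gene != bait]
    hits = [(gene, r) for gene, r in scored if r >= threshold]
    return sorted(hits, key=lambda item: -item[1])


# Hypothetical expression values across four conditions.
expression = {
    "known_gene": [1.0, 2.0, 3.0, 4.0],
    "unknown_A": [2.0, 4.0, 6.0, 8.0],
    "unknown_B": [4.0, 3.0, 2.0, 1.0],
    "unknown_C": [1.0, 1.0, 1.0, 1.0],
}
candidates = coexpressed_candidates(expression, "known_gene")
print(candidates)
```

Genes that pass the threshold are candidates for involvement in the same metabolic process as the bait gene. They still require confirmation, for example by metabolite profiling of the corresponding mutants.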
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Two Methods of Heterokaryon Formation to Discover HCV Restriction Factors
Institutions: Twincore, Centre for Experimental and Clinical Infection Research, The Rockefeller University, NY.
Hepatitis C virus (HCV) is a hepatotropic virus with a host-range restricted to humans and chimpanzees. Although HCV RNA replication has been observed in human non-hepatic and murine cell lines, the efficiency was very low and required long-term selection procedures using HCV replicon constructs expressing dominant antibiotic-selectable markers1-5. HCV in vitro research is therefore limited to human hepatoma cell lines permissive for virus entry and completion of the viral life cycle. Due to HCV's narrow species tropism, there is no immunocompetent small animal model available that sustains the complete HCV replication cycle 6-8. Inefficient replication of HCV in non-human cells, e.g. of mouse origin, is likely due to genetic incompatibility of essential host dependency factors and/or expression of restriction factors.
We investigated whether HCV propagation is suppressed by dominant restriction factors in either human cell lines derived from non-hepatic tissues or in mouse liver cell lines. To this end, we developed two independent conditional trans-complementation methods relying on somatic cell fusion. In both cases, completion of the viral replication cycle is only possible in the heterokaryons. Consequently, successful trans-complementation, which is determined by measuring de novo production of infectious viral progeny, indicates absence of dominant restrictions.
Specifically, subgenomic HCV replicons carrying a luciferase transgene were transfected into highly permissive human hepatoma cells (Huh-7.5 cells). Subsequently, these cells were co-cultured and fused to various human and murine cells expressing HCV structural proteins core, envelope 1 and 2 (E1, E2) and accessory proteins p7 and NS2. Provided that cell fusion was initiated by treatment with polyethylene-glycol (PEG), the culture released infectious viral particles which infected naïve cells in a receptor-dependent fashion.
To assess the influence of dominant restrictions on the complete viral life cycle including cell entry, RNA translation, replication and virus assembly, we took advantage of a human liver cell line (Huh-7 Lunet N cells 9) which lacks endogenous expression of CD81, an essential entry factor of HCV. In the absence of ectopically expressed CD81, these cells are essentially refractory to HCV infection 10. Importantly, when co-cultured and fused with cells that express human CD81 but lack at least one other crucial cell entry factor (i.e. SR-BI, CLDN1, OCLN), only the resulting heterokaryons display the complete set of HCV entry factors requisite for infection. Therefore, to analyze if dominant restriction factors suppress completion of the HCV replication cycle, we fused Lunet N cells with various cells of human and mouse origin which fulfill the above mentioned criteria. When co-cultured cells were transfected with a highly fusogenic viral envelope protein mutant of the prototype foamy virus (PFV11) and subsequently challenged with infectious HCV particles (HCVcc), de novo production of infectious virus was observed. This indicates that HCV successfully completed its replication cycle in heterokaryons, thus ruling out expression of dominant restriction factors in these cell lines. These novel conditional trans-complementation methods will be useful to screen a large panel of cell lines and primary cells for expression of HCV-specific dominant restriction factors.
Virology, Issue 65, Immunology, Molecular Biology, Genetics, cell fusion, HCV, restriction factor, heterokaryon, mouse, species-specificity
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI)1. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7, but GI information remains sparse for prokaryotes8, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10.
Here, we present the key steps required to perform a quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9, using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2, as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
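The comparison of observed and expected double mutant growth can be sketched as below. The abstract does not state how the expected fitness is computed. This sketch assumes the commonly used multiplicative model (expected double mutant fitness equals the product of the single mutant fitnesses) and an arbitrary cutoff of 0.1, both labeled here as assumptions rather than the authors' scoring scheme. Fitness values are hypothetical and expressed relative to wild type.

```python
def gi_score(fitness_a, fitness_b, fitness_ab):
    """Observed minus expected double mutant fitness.

    Assumption: expected fitness follows a multiplicative model.
    """
    return fitness_ab - fitness_a * fitness_b


def classify_interaction(fitness_a, fitness_b, fitness_ab, cutoff=0.1):
    """Label a gene pair; the cutoff is an illustrative placeholder."""
    score = gi_score(fitness_a, fitness_b, fitness_ab)
    if score <= -cutoff:
        return "aggravating"  # double mutant grows worse than expected
    if score >= cutoff:
        return "alleviating"  # double mutant grows better than expected
    return "neutral"


# Hypothetical relative colony sizes (wild type = 1.0).
print(classify_interaction(0.9, 0.9, 0.2))   # compensatory pathways
print(classify_interaction(0.6, 0.6, 0.6))   # same pathway or complex
print(classify_interaction(0.9, 0.8, 0.72))  # no interaction
```

In a real screen the cutoff would be replaced by a statistical test on replicate colony measurements, since the text specifies that the deviation must be significant.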
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Substrate Generation for Endonucleases of CRISPR/Cas Systems
Institutions: Max-Planck-Institute for Terrestrial Microbiology.
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) 1-3. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems 4. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster 5. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs 6-8. These crRNAs consist of the spacer sequence flanked by a 5' terminal (8 nucleotides) and a 3' terminal tag derived from the repeat sequence 9. A repeated infection of the virus can now be blocked as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity10. Finally, for CRISPR/Cas type 1 systems, the nuclease Cas3 will destroy the detected invader DNA 11,12.
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases 4.
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
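The relationship between a pre-crRNA and its mature crRNAs can be modeled at the sequence level with a short sketch. It assumes a single cleavage within each repeat, 8 nucleotides upstream of the repeat's 3' end, which yields the 8 nucleotide 5' tag described above. The repeat and spacer sequences are made-up placeholders, and the model ignores RNA structure and the chemistry of the termini.

```python
def build_pre_crrna(repeat, spacers):
    """Concatenate repeat-spacer units into repeat-spacer-...-repeat."""
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return "".join(parts)


def process_pre_crrna(repeat, spacers, tag_len=8):
    """Cut once in every repeat, tag_len nucleotides before its 3' end."""
    pre_crrna = build_pre_crrna(repeat, spacers)
    cut_offset = len(repeat) - tag_len
    cut_sites, position = [], 0
    for spacer in spacers:
        cut_sites.append(position + cut_offset)
        position += len(repeat) + len(spacer)
    cut_sites.append(position + cut_offset)  # cut in the final repeat
    # Fragments between consecutive cuts are the mature crRNAs.
    return [pre_crrna[start:end] for start, end in zip(cut_sites, cut_sites[1:])]


# Placeholder sequences, not taken from any real CRISPR locus.
repeat = "ACGUACGUACGUACGUAAAACCCC"
spacers = ["GGGGGGGGGG", "UUUUUUUUUU"]
crrnas = process_pre_crrna(repeat, spacers)
for crrna in crrnas:
    print(crrna)
```

Each product carries the last 8 nucleotides of the upstream repeat as its 5' tag and the remainder of the downstream repeat as its 3' tag, matching the crRNA layout described in the text.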
Molecular biology, Issue 67, CRISPR/Cas, endonuclease, in vitro transcription, crRNA, Cas6
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e. total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contributions of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
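As an illustration of how half-lives can be derived from such data, the sketch below uses a simple model that is not stated in the abstract: transcript levels are at steady state, decay is first order, and cell growth during labeling is negligible. Under these assumptions the fraction of newly transcribed RNA after labeling time t is R = 1 - exp(-k t), so the half-life is -t ln(2) / ln(1 - R). The function name and input values are hypothetical.

```python
from math import log


def half_life_from_ratio(new_to_total_ratio, labeling_time_min):
    """Estimate RNA half-life (minutes) from the newly transcribed / total ratio.

    Assumptions: steady state, first-order decay, negligible cell growth.
    """
    if not 0.0 < new_to_total_ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return -labeling_time_min * log(2) / log(1.0 - new_to_total_ratio)


# Hypothetical ratios after 60 minutes of 4sU labeling.
for ratio in (0.25, 0.5, 0.9):
    print(ratio, round(half_life_from_ratio(ratio, 60), 1))
```

A transcript that is half replaced during the labeling period has a half-life equal to the labeling time. Ratios close to 0 or 1 give unstable estimates, which is why the labeling time should be matched to the expected range of half-lives.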
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
PAR-CliP - A Method to Identify Transcriptome-wide the Binding Sites of RNA Binding Proteins
Institutions: Rockefeller University, Max-Delbrück-Center for Molecular Medicine, Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB), Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB), Rockefeller University.
RNA transcripts are subjected to post-transcriptional gene regulation by interacting with hundreds of RNA-binding proteins (RBPs) and microRNA-containing ribonucleoprotein complexes (miRNPs) that are often expressed in a cell-type dependent manner. To understand how the interplay of these RNA-binding factors affects the regulation of individual transcripts, high resolution maps of in vivo protein-RNA interactions are necessary1.
A combination of genetic, biochemical and computational approaches is typically applied to identify RNA-RBP or RNA-RNP interactions. Microarray profiling of RNAs associated with immunopurified RBPs (RIP-Chip)2 defines targets at a transcriptome level, but its application is limited to the characterization of kinetically stable interactions and only in rare cases3,4 allows identification of the RBP recognition element (RRE) within the long target RNA. More direct RBP target site information is obtained by combining in vivo UV crosslinking with immunoprecipitation, followed by the isolation of crosslinked RNA segments and cDNA sequencing (CLIP)10. CLIP was used to identify targets of a number of RBPs11-17. However, CLIP is limited by the low efficiency of UV 254 nm RNA-protein crosslinking, and the location of the crosslink is not readily identifiable within the sequenced crosslinked fragments, making it difficult to separate UV-crosslinked target RNA segments from background non-crosslinked RNA fragments also present in the sample.
We developed a powerful cell-based crosslinking approach to determine at high resolution and transcriptome-wide the binding sites of cellular RBPs and miRNPs that we term PAR-CliP (Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation) (see Fig. 1A for an outline of the method). The method relies on the incorporation of photoreactive ribonucleoside analogs, such as 4-thiouridine (4-SU) and 6-thioguanosine (6-SG), into nascent RNA transcripts by living cells. Irradiation of the cells by UV light of 365 nm induces efficient crosslinking of photoreactive nucleoside-labeled cellular RNAs to interacting RBPs. Immunoprecipitation of the RBP of interest is followed by isolation of the crosslinked and coimmunoprecipitated RNA. The isolated RNA is converted into a cDNA library and deep sequenced using Solexa technology. One characteristic feature of cDNA libraries prepared by PAR-CliP is that the precise position of crosslinking can be identified by mutations residing in the sequenced cDNA. When using 4-SU, crosslinked sequences show a thymidine to cytidine transition, whereas using 6-SG results in guanosine to adenosine mutations. The presence of the mutations in crosslinked sequences makes it possible to separate them from the background of sequences derived from abundant cellular RNAs.
Application of the method to a number of diverse RNA binding proteins was reported in Hafner et al.18
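The use of thymidine to cytidine transitions to locate crosslink sites can be sketched as follows. This is a simplified illustration, not the published analysis pipeline: it assumes ungapped, same-strand alignments given as (start, sequence) pairs with non-negative start positions, and the reference and reads are hypothetical.

```python
def count_t_to_c(reference, aligned_reads):
    """Return {position: (T-to-C reads, covering reads)} for converted sites.

    aligned_reads: iterable of (start, sequence) ungapped alignments.
    """
    conversions = [0] * len(reference)
    coverage = [0] * len(reference)
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if position >= len(reference):
                break  # ignore read bases past the reference end
            if reference[position] == "T":
                coverage[position] += 1
                if base == "C":
                    conversions[position] += 1
    return {position: (conversions[position], coverage[position])
            for position in range(len(reference))
            if conversions[position] > 0}


# Hypothetical reference and reads (cDNA sequence, so T rather than U).
reference = "ACGTTGCATG"
reads = [(0, "ACGTCGCA"), (2, "GTCGCATG"), (3, "TTGCATG")]
sites = count_t_to_c(reference, reads)
print(sites)
```

Positions where a large share of covering reads carry the transition are candidate crosslink sites, whereas reads without any transition are more likely to derive from background RNA. For 6-SG labeling the same logic would apply to guanosine to adenosine changes.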
Cellular Biology, Issue 41, UV crosslinking, RNA binding proteins, RNA binding motif, 4-thiouridine, 6-thioguanosine