Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
18 Related JoVE Articles!
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g.
drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2
. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4
in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord
Institutions: Universidade de São Paulo.
electroporation of the chick neural tube is a fast and inexpensive method for identification of gene function during neural development. Genome wide analysis of differentially expressed transcripts after such an experimental manipulation has the potential to uncover an almost complete picture of the downstream effects caused by the transfected construct. This work describes a simple method for comparing transcriptomes from samples of transfected embryonic spinal cords comprising all steps between electroporation and identification of differentially expressed transcripts. The first stage consists of guidelines for electroporation and instructions for dissection of transfected spinal cord halves from HH23 embryos in ribonuclease-free environment and extraction of high-quality RNA samples suitable for transcriptome sequencing. The next stage is that of bioinformatic analysis with general guidelines for filtering and comparison of RNA-Seq datasets in the Galaxy public server, which eliminates the need of a local computational structure for small to medium scale experiments. The representative results show that the dissection methods generate high quality RNA samples and that the transcriptomes obtained from two control samples are essentially the same, an important requirement for detection of differential expression genes in experimental samples. Furthermore, one example is provided where experimental overexpression of a DNA construct can be visually verified after comparison with control samples. The application of this method may be a powerful tool to facilitate new discoveries on the function of neural factors involved in spinal cord early development.
Developmental Biology, Issue 93, chicken embryo, in ovo electroporation, spinal cord, RNA-Seq, transcriptome profiling, Galaxy workflow
Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA
Institutions: Morgridge Institute for Research, University of Wisconsin, University of California.
Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity, a large dynamic range and allows measurement of transcript copy numbers in a sample. Illumina’s genome analyzer performs sequencing of a large number (> 107
) of relatively short sequence reads (< 150 bp).The "paired end" approach, wherein a single long read is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo
One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples have been described1, 2
but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published3, 4
but require microgram amounts of starting total RNA.
Here we describe a protocol for the Illumina Genome Analyzer II platform for mRNA-Seq sequencing for library preparation that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing5
, where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome6
and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples7
Molecular Biology, Issue 56, Genetics, mRNA-Seq, Illumina-Seq, gene expression profiling, high throughput sequencing
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus
, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus.
Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus
due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus Asian tiger mosquito, photoperiodic diapause, RNA-Seq de novo transcriptome assembly, mosquito husbandry
Highly Efficient Ligation of Small RNA Molecules for MicroRNA Quantitation by High-Throughput Sequencing
Institutions: University of Colorado, Boulder, University of Colorado, Denver.
MiRNA cloning and high-throughput sequencing, termed miR-Seq, stands alone as a transcriptome-wide approach to quantify miRNAs with single nucleotide resolution. This technique captures miRNAs by attaching 3’ and 5’ oligonucleotide adapters to miRNA molecules and allows de novo
miRNA discovery. Coupling with powerful next-generation sequencing platforms, miR-Seq has been instrumental in the study of miRNA biology. However, significant biases introduced by oligonucleotide ligation steps have prevented miR-Seq from being employed as an accurate quantitation tool. Previous studies demonstrate that biases in current miR-Seq methods often lead to inaccurate miRNA quantification with errors up to 1,000-fold for some miRNAs1,2
. To resolve these biases imparted by RNA ligation, we have developed a small RNA ligation method that results in ligation efficiencies of over 95% for both 3’ and 5′ ligation steps. Benchmarking this improved library construction method using equimolar or differentially mixed synthetic miRNAs, consistently yields reads numbers with less than two-fold deviation from the expected value. Furthermore, this high-efficiency miR-Seq method permits accurate genome-wide miRNA profiling from in vivo
total RNA samples2
Molecular Biology, Issue 93, RNA, ligation, miRNA, miR-Seq, linker, oligonucleotide, high-throughput sequencing
Analysis of Tubular Membrane Networks in Cardiac Myocytes from Atria and Ventricles
Institutions: Heart Research Center Goettingen, University Medical Center Goettingen, German Center for Cardiovascular Research (DZHK) partner site Goettingen, University of Maryland School of Medicine.
In cardiac myocytes a complex network of membrane tubules - the transverse-axial tubule system (TATS) - controls deep intracellular signaling functions. While the outer surface membrane and associated TATS membrane components appear to be continuous, there are substantial differences in lipid and protein content. In ventricular myocytes (VMs), certain TATS components are highly abundant contributing to rectilinear tubule networks and regular branching 3D architectures. It is thought that peripheral TATS components propagate action potentials from the cell surface to thousands of remote intracellular sarcoendoplasmic reticulum (SER) membrane contact domains, thereby activating intracellular Ca2+
release units (CRUs). In contrast to VMs, the organization and functional role of TATS membranes in atrial myocytes (AMs) is significantly different and much less understood. Taken together, quantitative structural characterization of TATS membrane networks in healthy and diseased myocytes is an essential prerequisite towards better understanding of functional plasticity and pathophysiological reorganization. Here, we present a strategic combination of protocols for direct quantitative analysis of TATS membrane networks in living VMs and AMs. For this, we accompany primary cell isolations of mouse VMs and/or AMs with critical quality control steps and direct membrane staining protocols for fluorescence imaging of TATS membranes. Using an optimized workflow for confocal or superresolution TATS image processing, binarized and skeletonized data are generated for quantitative analysis of the TATS network and its components. Unlike previously published indirect regional aggregate image analysis strategies, our protocols enable direct characterization of specific components and derive complex physiological properties of TATS membrane networks in living myocytes with high throughput and open access software tools. In summary, the combined protocol strategy can be readily applied for quantitative TATS network studies during physiological myocyte adaptation or disease changes, comparison of different cardiac or skeletal muscle cell types, phenotyping of transgenic models, and pharmacological or therapeutic interventions.
Bioengineering, Issue 92, cardiac myocyte, atria, ventricle, heart, primary cell isolation, fluorescence microscopy, membrane tubule, transverse-axial tubule system, image analysis, image processing, T-tubule, collagenase
Massively Parallel Reporter Assays in Cultured Mammalian Cells
Institutions: Broad Institute.
The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence on interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.
Genetics, Issue 90, gene regulation, transcriptional regulation, sequence-activity mapping, reporter assay, library cloning, transfection, tag sequencing, mammalian cells
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University, BioMérieux, Hospices Civils de Lyon, Hospices Civils de Lyon.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values1
. ‘How to use PSA' remains a current issue, either for diagnosis as a gray zone corresponding to a concentration in serum of 2.5-10 ng/ml which does not allow a clear differentiation to be made between cancer and noncancer2
or for patient follow-up as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application3,4
. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g.
PCA3 in prostate cancer5,6
and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5x higher than the one corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer7-10
. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Detecting Somatic Genetic Alterations in Tumor Specimens by Exon Capture and Massively Parallel Sequencing
Institutions: Memorial Sloan-Kettering Cancer Center, Memorial Sloan-Kettering Cancer Center.
Efforts to detect and investigate key oncogenic mutations have proven valuable to facilitate the appropriate treatment for cancer patients. The establishment of high-throughput, massively parallel "next-generation" sequencing has aided the discovery of many such mutations. To enhance the clinical and translational utility of this technology, platforms must be high-throughput, cost-effective, and compatible with formalin-fixed paraffin embedded (FFPE) tissue samples that may yield small amounts of degraded or damaged DNA. Here, we describe the preparation of barcoded and multiplexed DNA libraries followed by hybridization-based capture of targeted exons for the detection of cancer-associated mutations in fresh frozen and FFPE tumors by massively parallel sequencing. This method enables the identification of sequence mutations, copy number alterations, and select structural rearrangements involving all targeted genes. Targeted exon sequencing offers the benefits of high throughput, low cost, and deep sequence coverage, thus conferring high sensitivity for detecting low frequency mutations.
Molecular Biology, Issue 80, Molecular Diagnostic Techniques, High-Throughput Nucleotide Sequencing, Genetics, Neoplasms, Diagnosis, Massively parallel sequencing, targeted exon sequencing, hybridization capture, cancer, FFPE, DNA mutations
Efficient and Rapid Isolation of Early-stage Embryos from Arabidopsis thaliana Seeds
Institutions: University of Zürich.
In flowering plants, the embryo develops within a nourishing tissue - the endosperm - surrounded by the maternal seed integuments (or seed coat). As a consequence, the isolation of plant embryos at early stages (1 cell to globular stage) is technically challenging due to their relative inaccessibility. Efficient manual dissection at early stages is strongly impaired by the small size of young Arabidopsis
seeds and the adhesiveness of the embryo to the surrounding tissues. Here, we describe a method that allows the efficient isolation of young Arabidopsis
embryos, yielding up to 40 embryos in 1 hr to 4 hr, depending on the downstream application. Embryos are released into isolation buffer by slightly crushing 250-750 seeds with a plastic pestle in an Eppendorf tube. A glass microcapillary attached to either a standard laboratory pipette (via a rubber tube) or a hydraulically controlled microinjector is used to collect embryos from droplets placed on a multi-well slide on an inverted light microscope. The technical skills required are simple and easily transferable, and the basic setup does not require costly equipment. Collected embryos are suitable for a variety of downstream applications such as RT-PCR, RNA sequencing, DNA methylation analyses, fluorescence in situ
hybridization (FISH), immunostaining, and reporter gene assays.
Plant Biology, Issue 76, Cellular Biology, Developmental Biology, Molecular Biology, Genetics, Embryology, Embryo isolation, Arabidopsis thaliana, RNA amplification, transcriptomics, DNA methylation profiling, FISH, reporter assays
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico
RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e.
total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contribution of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila
, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
Trajectory Data Analyses for Pedestrian Space-time Activity Study
Institutions: Kean University, University of Wisconsin-Madison.
It is well recognized that human movement in the spatial and temporal dimensions has direct influence on disease transmission1-3
. An infectious disease typically spreads via contact between infected and susceptible individuals in their overlapped activity spaces. Therefore, daily mobility-activity information can be used as an indicator to measure exposures to risk factors of infection. However, a major difficulty and thus the reason for paucity of studies of infectious disease transmission at the micro scale arise from the lack of detailed individual mobility data. Previously in transportation and tourism research detailed space-time activity data often relied on the time-space diary technique, which requires subjects to actively record their activities in time and space. This is highly demanding for the participants and collaboration from the participants greatly affects the quality of data4
Modern technologies such as GPS and mobile communications have made possible the automatic collection of trajectory data. The data collected, however, is not ideal for modeling human space-time activities, limited by the accuracies of existing devices. There is also no readily available tool for efficient processing of the data for human behavior study. We present here a suite of methods and an integrated ArcGIS desktop-based visual interface for the pre-processing and spatiotemporal analyses of trajectory data. We provide examples of how such processing may be used to model human space-time activities, especially with error-rich pedestrian trajectory data, that could be useful in public health studies such as infectious disease transmission modeling.
The procedure presented includes pre-processing, trajectory segmentation, activity space characterization, density estimation and visualization, and a few other exploratory analysis methods. Pre-processing is the cleaning of noisy raw trajectory data. We introduce an interactive visual pre-processing interface as well as an automatic module. Trajectory segmentation5
involves the identification of indoor and outdoor parts from pre-processed space-time tracks. Again, both interactive visual segmentation and automatic segmentation are supported. Segmented space-time tracks are then analyzed to derive characteristics of one's activity space such as activity radius etc.
Density estimation and visualization are used to examine large amount of trajectory data to model hot spots and interactions. We demonstrate both density surface mapping6
and density volume rendering7
. We also include a couple of other exploratory data analyses (EDA) and visualizations tools, such as Google Earth animation support and connection analysis. The suite of analytical as well as visual methods presented in this paper may be applied to any trajectory data for space-time activity studies.
Environmental Sciences, Issue 72, Computer Science, Behavior, Infectious Diseases, Geography, Cartography, Data Display, Disease Outbreaks, cartography, human behavior, Trajectory data, space-time activity, GPS, GIS, ArcGIS, spatiotemporal analysis, visualization, segmentation, density surface, density volume, exploratory data analysis, modelling
Depletion of Ribosomal RNA for Mosquito Gut Metagenomic RNA-seq
Institutions: New Mexico State University.
The mosquito gut accommodates dynamic microbial communities across different stages of the insect's life cycle. Characterization of the genetic capacity and functionality of the gut community will provide insight into the effects of gut microbiota on mosquito life traits. Metagenomic RNA-Seq has become an important tool to analyze transcriptomes from various microbes present in a microbial community. Messenger RNA usually comprises only 1-3% of total RNA, while rRNA constitutes approximately 90%. It is challenging to enrich messenger RNA from a metagenomic microbial RNA sample because most prokaryotic mRNA species lack stable poly(A) tails. This prevents oligo d(T) mediated mRNA isolation. Here, we describe a protocol that employs sample derived rRNA capture probes to remove rRNA from a metagenomic total RNA sample. To begin, both mosquito and microbial small and large subunit rRNA fragments are amplified from a metagenomic community DNA sample. Then, the community specific biotinylated antisense ribosomal RNA probes are synthesized in vitro
using T7 RNA polymerase. The biotinylated rRNA probes are hybridized to the total RNA. The hybrids are captured by streptavidin-coated beads and removed from the total RNA. This subtraction-based protocol efficiently removes both mosquito and microbial rRNA from the total RNA sample. The mRNA enriched sample is further processed for RNA amplification and RNA-Seq.
Genetics, Issue 74, Infection, Infectious Diseases, Molecular Biology, Cellular Biology, Microbiology, Genomics, biology (general), genetics (animal and plant), life sciences, Eukaryota, Bacteria, metagenomics, metatranscriptome, RNA-seq, rRNA depletion, mRNA enrichment, mosquito gut microbiome, RNA, DNA, sequencing
Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation
Institutions: Agency for Science, Technology and Research, Singapore, A*STAR-Duke-NUS Neuroscience Research Partnership, Singapore, National University of Singapore, Singapore.
Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces 7, 2, 12
. Such architectures are not random and involve interactions between gene promoters and regulatory elements 13
. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination 1, 14
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures 5,6
. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis is then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions 8
We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video.
Genetics, Issue 62, ChIP, ChIA-PET, Chromatin Interactions, Genomics, Next-Generation Sequencing
Quantitative and Automated High-throughput Genome-wide RNAi Screens in C. elegans
Institutions: Université de la Méditerranée.
RNA interference is a powerful method to understand gene function, especially when conducted at a whole-genome scale and in a quantitative context. In C. elegans
, gene function can be knocked down simply and efficiently by feeding worms with bacteria expressing a dsRNA corresponding to a specific gene 1
. While the creation of libraries of RNAi clones covering most of the C. elegans
opened the way for true functional genomic studies (see for example 4-7
), most established methods are laborious. Moy and colleagues have developed semi-automated protocols that facilitate genome-wide screens 8
. The approach relies on microscopic imaging and image analysis.
Here we describe an alternative protocol for a high-throughput genome-wide screen, based on robotic handling of bacterial RNAi clones, quantitative analysis using the COPAS Biosort (Union Biometrica (UBI)), and an integrated software: the MBioLIMS (Laboratory Information Management System from Modul-Bio) a technology that provides increased throughput for data management and sample tracking. The method allows screens to be conducted on solid medium plates. This is particularly important for some studies, such as those addressing host-pathogen interactions in C. elegans
, since certain microbes do not efficiently infect worms in liquid culture.
We show how the method can be used to quantify the importance of genes in anti-fungal innate immunity in C. elegans
. In this case, the approach relies on the use of a transgenic strain carrying an epidermal infection-inducible fluorescent reporter gene, with GFP under the control of the promoter of the antimicrobial peptide gene nlp 29
and a red fluorescent reporter that is expressed constitutively in the epidermis. The latter provides an internal control for the functional integrity of the epidermis and nonspecific transgene silencing9
. When control worms are infected by the fungus they fluoresce green. Knocking down by RNAi a gene required for nlp 29
expression results in diminished fluorescence after infection. Currently, this protocol allows more than 3,000 RNAi clones to be tested and analyzed per week, opening the possibility of screening the entire genome in less than 2 months.
Molecular Biology, Issue 60, C. elegans, fluorescent reporter, Biosort, LIMS, innate immunity, Drechmeria coniospora
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan , University of Kansas.
Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
iCLIP - Transcriptome-wide Mapping of Protein-RNA Interactions with Individual Nucleotide Resolution
Institutions: Medical Research Council - MRC, EMBL Heidelberg, University of Ljubljana, Wellcome Trust Sanger Institute.
The unique composition and spatial arrangement of RNA-binding proteins (RBPs) on a transcript guide the diverse aspects of post-transcriptional regulation1
. Therefore, an essential step towards understanding transcript regulation at the molecular level is to gain positional information on the binding sites of RBPs2
Protein-RNA interactions can be studied using biochemical methods, but these approaches do not address RNA binding in its native cellular context. Initial attempts to study protein-RNA complexes in their cellular environment employed affinity purification or immunoprecipitation combined with differential display or microarray analysis (RIP-CHIP)3-5
. These approaches were prone to identifying indirect or non-physiological interactions6
. In order to increase the specificity and positional resolution, a strategy referred to as CLIP (UV cross-linking and immunoprecipitation) was introduced7,8
. CLIP combines UV cross-linking of proteins and RNA molecules with rigorous purification schemes including denaturing polyacrylamide gel electrophoresis. In combination with high-throughput sequencing technologies, CLIP has proven as a powerful tool to study protein-RNA interactions on a genome-wide scale (referred to as HITS-CLIP or CLIP-seq)9,10
. Recently, PAR-CLIP was introduced that uses photoreactive ribonucleoside analogs for cross-linking11,12
Despite the high specificity of the obtained data, CLIP experiments often generate cDNA libraries of limited sequence complexity. This is partly due to the restricted amount of co-purified RNA and the two inefficient RNA ligation reactions required for library preparation. In addition, primer extension assays indicated that many cDNAs truncate prematurely at the crosslinked nucleotide13
. Such truncated cDNAs are lost during the standard CLIP library preparation protocol. We recently developed iCLIP (individual-nucleotide resolution CLIP), which captures the truncated cDNAs by replacing one of the inefficient intermolecular RNA ligation steps with a more efficient intramolecular cDNA circularization (Figure 1)14
. Importantly, sequencing the truncated cDNAs provides insights into the position of the cross-link site at nucleotide resolution. We successfully applied iCLIP to study hnRNP C particle organization on a genome-wide scale and assess its role in splicing regulation14
Cellular Biology, Issue 50, RNA biochemistry, transcriptome, systems biology, RNA-binding protein