Hematopoietic stem cells (HSCs) are used clinically in transplantation to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSC self-renewal and differentiation is important for the application of HSCs in research and clinical use. However, it is not possible to obtain large quantities of HSCs due to their inability to proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarrays for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified as the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
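The final analysis step, identifying differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells, can be illustrated with a minimal sketch. It assumes normalized expression values are already available for each population and ranks a supplied list of transcription factors by absolute log2 fold change. The gene names, expression values, pseudocount and fold-change cutoff are all hypothetical, and this simple ranking only stands in for the statistical testing a full analysis would use; it is not the pipeline described in the protocol.

```python
import math


def rank_differential_tfs(expr_pos, expr_neg, tf_names,
                          pseudocount=1.0, min_abs_log2fc=1.0):
    """Rank transcription factors by |log2 fold change| (CD34+ over CD34-).

    expr_pos / expr_neg map gene name -> normalized expression value.
    The pseudocount and cutoff are illustrative assumptions.
    """
    ranked = []
    for gene in tf_names:
        if gene not in expr_pos or gene not in expr_neg:
            continue  # skip factors missing from either sample
        ratio = (expr_pos[gene] + pseudocount) / (expr_neg[gene] + pseudocount)
        log2fc = math.log2(ratio)
        if abs(log2fc) >= min_abs_log2fc:
            ranked.append((gene, log2fc))
    # Largest absolute change first
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


# Hypothetical expression values (placeholders, not real data)
expr_cd34_pos = {"GeneA": 31.0, "GeneB": 7.0, "GeneC": 15.0, "GeneD": 3.0}
expr_cd34_neg = {"GeneA": 3.0, "GeneB": 7.0, "GeneC": 63.0, "GeneD": 3.0}
transcription_factors = ["GeneA", "GeneB", "GeneC"]  # hypothetical TF list

result = rank_differential_tfs(expr_cd34_pos, expr_cd34_neg, transcription_factors)
for gene, log2fc in result:
    print(f"{gene}\t{log2fc:+.2f}")
```

Restricting the ranking to a list of known transcription factors keeps the candidate set small enough for the downstream functional analysis mentioned above.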
22 Related JoVE Articles!
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g. drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3.
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4 in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
A High-throughput Automated Platform for the Development of Manufacturing Cell Lines for Protein Therapeutics
Institutions: Merck & Co., Inc.
The fast-growing biopharmaceutical industry demands speedy development of highly efficient and reliable production systems to meet the increasing requirement for drug supplies. The generation of production cell lines has traditionally involved manual operations that are labor-intensive, low-throughput and vulnerable to human errors. We report here an integrated high-throughput and automated platform for development of manufacturing cell lines for the production of protein therapeutics.
The combination of BD FACS Aria Cell Sorter, CloneSelect Imager and TECAN Freedom EVO liquid handling system has enabled a high-throughput and more efficient cell line development process. In this operation, production host cells are first transfected with an expression vector carrying the gene of interest 1, followed by treatment with a selection agent. The stably transfected cells are then stained with fluorescence-labeled anti-human IgG antibody, and are subsequently subjected to flow cytometry analysis 2-4. Highly productive cells are selected based on fluorescence intensity and are isolated by single-cell sorting on a BD FACSAria. Colony formation from the single-cell stage is detected microscopically, and a series of time-lapse digital images is taken by the CloneSelect Imager to document cell line history. After single clones have formed, they are screened for productivity by ELISA performed on a TECAN Freedom EVO liquid handling system. Approximately 2,000 - 10,000 clones can be screened per operation cycle with the current system setup.
This integrated approach has been used to generate high producing Chinese hamster ovary (CHO) cell lines for the production of therapeutic monoclonal antibodies (mAbs) as well as their fusion proteins. With the aid of different types of detecting probes, the method can be used for developing other protein therapeutics or be applied to other production host systems. Compared with the traditional manual procedure, this automated platform demonstrated the advantages of significantly increased capacity, ensured clonality, traceability of cell line history with electronic documentation and much reduced opportunity for operator error.
Medicine, Issue 55, Manufacturing cell line, protein therapeutics, automation, high-throughput, FACS, FACS Aria, CloneSelect Imager, TECAN Freedom EVO liquid handling system
High-throughput Physical Mapping of Chromosomes using Automated in situ Hybridization
Institutions: Virginia Tech.
Projects to obtain whole-genome sequences for 10,000 vertebrate species1 and for 5,000 insect and related arthropod species2 are expected to take place over the next 5 years. For example, the sequencing of the genomes for 15 malaria mosquito species is currently being done using an Illumina platform3,4. This Anopheles species cluster includes both vectors and non-vectors of malaria. When the genome assemblies become available, researchers will have the unique opportunity to perform comparative analysis for inferring evolutionary changes relevant to vector ability. However, it has proven difficult to use next-generation sequencing reads to generate high-quality de novo genome assemblies. Moreover, the existing genome assemblies for Anopheles gambiae, although obtained using the Sanger method, are gapped or fragmented4,6.
Success of comparative genomic analyses will be limited if researchers deal with numerous sequencing contigs, rather than with chromosome-based genome assemblies. Fragmented, unmapped sequences create problems for genomic analyses because: (i) unidentified gaps cause incorrect or incomplete annotation of genomic sequences; (ii) unmapped sequences lead to confusion between paralogous genes and genes from different haplotypes; and (iii) the lack of chromosome assignment and orientation of the sequencing contigs does not allow for reconstructing rearrangement phylogeny and studying chromosome evolution. Developing high-resolution physical maps for species with newly sequenced genomes is a timely and cost-effective investment that will facilitate genome annotation, evolutionary analysis, and re-sequencing of individual genomes from natural populations7,8.
Here, we present innovative approaches to chromosome preparation, fluorescent in situ hybridization (FISH), and imaging that facilitate rapid development of physical maps. Using An. gambiae as an example, we demonstrate that the development of physical chromosome maps can potentially improve genome assemblies and, thus, the quality of genomic analyses. First, we use a high-pressure method to prepare polytene chromosome spreads. This method, originally developed for Drosophila9, allows the user to visualize more details on chromosomes than the regular squashing technique10. Second, a fully automated, front-end system for FISH is used for high-throughput physical genome mapping. The automated slide staining system runs multiple assays simultaneously and dramatically reduces hands-on time11. Third, an automatic fluorescent imaging system, which includes a motorized slide stage, automatically scans and photographs labeled chromosomes after FISH12. This system is especially useful for identifying and visualizing multiple chromosomal plates on the same slide. In addition, the scanning process captures a more uniform FISH result. Overall, the automated high-throughput physical mapping protocol is more efficient than a standard manual protocol.
Genetics, Issue 64, Entomology, Molecular Biology, Genomics, automation, chromosome, genome, hybridization, labeling, mapping, mosquito
A New Approach for the Comparative Analysis of Multiprotein Complexes Based on 15N Metabolic Labeling and Quantitative Mass Spectrometry
Institutions: University of Münster, Carnegie Institution for Science.
The introduced protocol provides a tool for the analysis of multiprotein complexes in the thylakoid membrane, revealing insights into complex composition under different conditions. In this protocol the approach is demonstrated by comparing the composition of the protein complex responsible for cyclic electron flow (CEF) in Chlamydomonas reinhardtii, isolated from genetically different strains. The procedure comprises the isolation of thylakoid membranes, followed by their separation into multiprotein complexes by sucrose density gradient centrifugation, SDS-PAGE, immunodetection and comparative, quantitative mass spectrometry (MS) based on differential metabolic labeling (14N/15N) of the analyzed strains. Detergent solubilized thylakoid membranes are loaded on sucrose density gradients at equal chlorophyll concentration. After ultracentrifugation, the gradients are separated into fractions, which are analyzed by mass spectrometry based on equal volume. This approach allows investigation of the composition within the gradient fractions and, moreover, analysis of the migration behavior of different proteins, especially focusing on ANR1, CAS, and PGRL1. Furthermore, the method is demonstrated by confirming the results with immunoblotting and by supporting the findings from previous studies (the identification and PSI-dependent migration of proteins that were previously described to be part of the CEF-supercomplex such as PGRL1, FNR, and cyt f). Notably, this approach is applicable to a broad range of questions; the protocol can be adopted and used, e.g., for comparative analyses of multiprotein complex composition isolated under distinct environmental conditions.
Microbiology, Issue 85, Sucrose density gradients, Chlamydomonas, multiprotein complexes, 15N metabolic labeling, thylakoids
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic, which presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments, followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article is that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also as an educational resource that could be incorporated into a classroom or teaching lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules, which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score, which estimates how accurate the predictions are without knowledge of the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was ranked as the best program for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Generation of Comprehensive Thoracic Oncology Database - Tool for Translational Research
Institutions: University of Chicago, Northshore University Health Systems.
The Thoracic Oncology Program Database Project was created to serve as a comprehensive, verified, and accessible repository for well-annotated cancer specimens and clinical data to be available to researchers within the Thoracic Oncology Research Program. This database also captures a large volume of genomic and proteomic data obtained from various tumor tissue studies. A team of clinical and basic science researchers, a biostatistician, and a bioinformatics expert was convened to design the database. Variables of interest were clearly defined and their descriptions were written within a standard operating manual to ensure consistency of data annotation. Using a protocol for prospective tissue banking and another protocol for retrospective banking, tumor and normal tissue samples were collected from patients who consented to these protocols. Clinical information such as demographics, cancer characterization, and treatment plans for these patients was abstracted and entered into an Access database. Proteomic and genomic data have been included in the database and have been linked to clinical information for patients described within the database. The data from each table were linked using the relationships function in Microsoft Access to allow the database manager to connect clinical and laboratory information during a query. The queried data can then be exported for statistical analysis and hypothesis generation.
Medicine, Issue 47, Database, Thoracic oncology, Bioinformatics, Biorepository, Microsoft Access, Proteomics, Genomics
The ITS2 Database
Institutions: University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1 and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8.
The ITS2 Database9 presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12 (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14 search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16 for multiple sequence-structure alignment calculation and Neighbor Joining18 tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
FIBS-enabled Noninvasive Metabolic Profiling
Institutions: Imperial College London.
In the era of computational biology, new high throughput experimental systems are necessary in order to populate and refine models so that they can be validated for predictive purposes. Ideally such systems would be low volume, which precludes sampling and destructive analyses when time course data are to be obtained. What is needed is an in situ monitoring tool which can report the necessary information in real-time and noninvasively. An interesting option is the use of fluorescent, protein-based in vivo biological sensors as reporters of intracellular concentrations. One particular class of in vivo biosensors that has found applications in metabolite quantification is based on Förster Resonance Energy Transfer (FRET) between two fluorescent proteins connected by a ligand binding domain. FRET integrated biological sensors (FIBS) are constitutively produced within the cell line; they have fast response times, and their spectral characteristics change based on the concentration of metabolite within the cell. This paper describes the method for constructing Chinese hamster ovary (CHO) cell lines that constitutively express a FIBS for glucose and glutamine, and for calibrating the FIBS in vivo in batch cell culture in order to enable future quantification of intracellular metabolite concentration. Data from fed-batch CHO cell cultures demonstrate that the FIBS was able in each case to detect the resulting change in the intracellular concentration. Using the fluorescent signal from the FIBS and the previously constructed calibration curve, the intracellular concentration was accurately determined, as confirmed by an independent enzymatic assay.
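Reading a concentration off a previously constructed calibration curve can be sketched as follows. This is only an illustration under stated assumptions: the calibration points are hypothetical, the signal is assumed to increase strictly with concentration, and piecewise-linear interpolation is used in place of whatever curve model the protocol actually fits.

```python
def concentration_from_signal(signal, calibration):
    """Interpolate a metabolite concentration from a sensor signal.

    calibration: list of (signal, concentration) pairs with strictly
    increasing, distinct signal values (an assumption of this sketch).
    """
    points = sorted(calibration)  # order by signal value
    lowest, highest = points[0][0], points[-1][0]
    if not lowest <= signal <= highest:
        # Refuse to extrapolate beyond the calibrated range
        raise ValueError("signal outside calibrated range")
    for (s0, c0), (s1, c1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            # Linear interpolation within the bracketing segment
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


# Hypothetical calibration: (fluorescence ratio, concentration in mM)
calibration_points = [(1.0, 0.0), (1.5, 2.0), (2.0, 6.0)]

low_estimate = concentration_from_signal(1.25, calibration_points)
high_estimate = concentration_from_signal(1.75, calibration_points)
print(low_estimate, high_estimate)
```

Raising an error outside the calibrated range is a deliberate choice here, since extrapolated values would not be supported by the calibration data.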
Bioengineering, Issue 84, metabolite monitoring, in vivo biosensors, in situ monitoring, mammalian cell culture, bioprocess engineering, medium formulation
Selective Labelling of Cell-surface Proteins using CyDye DIGE Fluor Minimal Dyes
Institutions: GE Healthcare Bio-Sciences AB.
Surface proteins are central to the cell's ability to react to its environment and to interact with neighboring cells. They are known to be inducers of almost all intracellular signaling. Moreover, they play an important role in environmental adaptation and drug treatment, and are often involved in disease pathogenesis and pathology (1). Protein-protein interactions are intrinsic to signaling pathways, and to gain more insight into these complex biological processes, sensitive and reliable methods are needed for studying cell surface proteins. Two-dimensional (2-D) electrophoresis is used extensively for detection of biomarkers and other targets in complex protein samples to study differential changes. Cell surface proteins, partly due to their low abundance (1-2% of cellular proteins), are difficult to detect in a 2-D gel without fractionation or some other type of enrichment. They are also often poorly represented in 2-D gels due to their hydrophobic nature and high molecular weight (2). In this study, we present a new protocol for intact cells using CyDye DIGE Fluor minimal dyes for specific labeling and detection of this important group of proteins. The results showed specific labeling of a large number of cell surface proteins with minimal labeling of intracellular proteins. This protocol is rapid, simple to use, and all three CyDye DIGE Fluor minimal dyes (Cy 2, Cy 3 and Cy 5) can be used to label cell-surface proteins. These features allow for multiplexing using 2-D Fluorescence Difference Gel Electrophoresis (2-D DIGE) with Ettan DIGE technology and analysis of protein expression changes using DeCyder 2-D Differential Analysis Software. The level of cell-surface proteins was followed during serum starvation of CHO cells for various lengths of time (see Table 1). Small changes in abundance were detected with high accuracy, and results are supported by defined statistical methods.
Biochemistry, Issue 21, Cell surface protein labelling, Ettan DIGE, CyDye DIGE Fluor minimal dyes, cell surface proteins, receptors, fluorescence, 2-D electrophoresis
PAR-CliP - A Method to Identify Transcriptome-wide the Binding Sites of RNA Binding Proteins
Institutions: Rockefeller University, Max-Delbrück-Center for Molecular Medicine, Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB).
RNA transcripts are subjected to post-transcriptional gene regulation by interacting with hundreds of RNA-binding proteins (RBPs) and microRNA-containing ribonucleoprotein complexes (miRNPs) that are often expressed in a cell-type-dependent manner. To understand how the interplay of these RNA-binding factors affects the regulation of individual transcripts, high resolution maps of in vivo protein-RNA interactions are necessary1.
A combination of genetic, biochemical and computational approaches is typically applied to identify RNA-RBP or RNA-RNP interactions. Microarray profiling of RNAs associated with immunopurified RBPs (RIP-Chip)2 defines targets at a transcriptome level, but its application is limited to the characterization of kinetically stable interactions, and only in rare cases3,4 does it allow identification of the RBP recognition element (RRE) within the long target RNA. More direct RBP target site information is obtained by combining in vivo UV crosslinking with immunoprecipitation, followed by the isolation of crosslinked RNA segments and cDNA sequencing (CLIP)10. CLIP was used to identify targets of a number of RBPs11-17. However, CLIP is limited by the low efficiency of UV 254 nm RNA-protein crosslinking, and the location of the crosslink is not readily identifiable within the sequenced crosslinked fragments, making it difficult to separate UV-crosslinked target RNA segments from background non-crosslinked RNA fragments also present in the sample.
We developed a powerful cell-based crosslinking approach to determine at high resolution and transcriptome-wide the binding sites of cellular RBPs and miRNPs that we term PAR-CliP (Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation) (see Fig. 1A for an outline of the method). The method relies on the incorporation of photoreactive ribonucleoside analogs, such as 4-thiouridine (4-SU) and 6-thioguanosine (6-SG), into nascent RNA transcripts by living cells. Irradiation of the cells by UV light of 365 nm induces efficient crosslinking of photoreactive nucleoside-labeled cellular RNAs to interacting RBPs. Immunoprecipitation of the RBP of interest is followed by isolation of the crosslinked and coimmunoprecipitated RNA. The isolated RNA is converted into a cDNA library and deep sequenced using Solexa technology. One characteristic feature of cDNA libraries prepared by PAR-CliP is that the precise position of crosslinking can be identified by mutations residing in the sequenced cDNA. When using 4-SU, crosslinked sequences contain thymidine to cytidine transitions, whereas using 6-SG results in guanosine to adenosine mutations. The presence of the mutations in crosslinked sequences makes it possible to separate them from the background of sequences derived from abundant cellular RNAs.
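The idea of using thymidine to cytidine transitions to separate crosslinked sequences from background can be sketched in a few lines. This is a simplified illustration, not the analysis pipeline of the method: it assumes each read has already been aligned to its reference segment without gaps, on the same strand and at equal length, and the sequences and the minimum-conversion threshold are hypothetical.

```python
def t_to_c_positions(reference, read):
    """Return 0-based positions where the reference has T and the read has C."""
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    pairs = zip(reference.upper(), read.upper())
    return [i for i, (ref_base, read_base) in enumerate(pairs)
            if ref_base == "T" and read_base == "C"]


def split_by_conversion(alignments, min_conversions=1):
    """Split reads into likely crosslinked and background sets.

    alignments: iterable of (reference_segment, read) pairs.
    min_conversions is an illustrative threshold, not a published value.
    """
    crosslinked, background = [], []
    for reference, read in alignments:
        if len(t_to_c_positions(reference, read)) >= min_conversions:
            crosslinked.append(read)
        else:
            background.append(read)
    return crosslinked, background


# Hypothetical ungapped alignments
alignments = [
    ("ACGTTGCA", "ACGTCGCA"),  # T-to-C transition at position 4
    ("ACGTTGCA", "ACGTTGCA"),  # perfect match, no conversion
    ("ACGTTGCA", "ACGATGCA"),  # T-to-A mismatch, not a diagnostic transition
]

crosslinked, background = split_by_conversion(alignments)
print(len(crosslinked), "crosslinked;", len(background), "background")
```

For 6-SG the same approach would look for guanosine to adenosine changes instead; only the pair of bases being compared differs.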
Application of the method to a number of diverse RNA binding proteins was reported in Hafner et al.18
Cellular Biology, Issue 41, UV crosslinking, RNA binding proteins, RNA binding motif, 4-thiouridine, 6-thioguanosine
An Affordable HIV-1 Drug Resistance Monitoring Method for Resource Limited Settings
Institutions: University of KwaZulu-Natal, Durban, South Africa, Jembi Health Systems, University of Amsterdam, Stanford Medical School.
HIV-1 drug resistance has the potential to seriously compromise the effectiveness and impact of antiretroviral therapy (ART). As ART programs in sub-Saharan Africa continue to expand, individuals on ART should be closely monitored for the emergence of drug resistance. Surveillance of transmitted drug resistance to track transmission of viral strains already resistant to ART is also critical. Unfortunately, drug resistance testing is still not readily accessible in resource limited settings, because genotyping is expensive and requires sophisticated laboratory and data management infrastructure. An open access genotypic drug resistance monitoring method to manage individuals and assess transmitted drug resistance is described. The method uses free open source software for the interpretation of drug resistance patterns and the generation of individual patient reports. The genotyping protocol has an amplification rate of greater than 95% for plasma samples with a viral load >1,000 HIV-1 RNA copies/ml. The sensitivity decreases significantly for viral loads <1,000 HIV-1 RNA copies/ml. The method described here was validated against a method of HIV-1 drug resistance testing approved by the United States Food and Drug Administration (FDA), the Viroseq genotyping method. Limitations of the method described here include the fact that it is not automated and that it also failed to amplify the circulating recombinant form CRF02_AG from a validation panel of samples, although it amplified subtypes A and B from the same panel.
Medicine, Issue 85, Biomedical Technology, HIV-1, HIV Infections, Viremia, Nucleic Acids, genetics, antiretroviral therapy, drug resistance, genotyping, affordable
RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord
Institutions: Universidade de São Paulo.
In ovo electroporation of the chick neural tube is a fast and inexpensive method for identification of gene function during neural development. Genome wide analysis of differentially expressed transcripts after such an experimental manipulation has the potential to uncover an almost complete picture of the downstream effects caused by the transfected construct. This work describes a simple method for comparing transcriptomes from samples of transfected embryonic spinal cords, comprising all steps between electroporation and identification of differentially expressed transcripts. The first stage consists of guidelines for electroporation and instructions for dissection of transfected spinal cord halves from HH23 embryos in a ribonuclease-free environment and extraction of high-quality RNA samples suitable for transcriptome sequencing. The next stage is that of bioinformatic analysis, with general guidelines for filtering and comparison of RNA-Seq datasets in the Galaxy public server, which eliminates the need for a local computational structure for small to medium scale experiments. The representative results show that the dissection methods generate high quality RNA samples and that the transcriptomes obtained from two control samples are essentially the same, an important requirement for detection of differentially expressed genes in experimental samples. Furthermore, one example is provided where experimental overexpression of a DNA construct can be visually verified after comparison with control samples. The application of this method may be a powerful tool to facilitate new discoveries on the function of neural factors involved in early spinal cord development.
Developmental Biology, Issue 93, chicken embryo, in ovo electroporation, spinal cord, RNA-Seq, transcriptome profiling, Galaxy workflow
Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA
Institutions: Morgridge Institute for Research, University of Wisconsin, University of California.
Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity, a large dynamic range and allows measurement of transcript copy numbers in a sample. Illumina’s genome analyzer performs sequencing of a large number (> 10^7) of relatively short sequence reads (< 150 bp). The "paired end" approach, wherein a single long read is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo assembly.
One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples has been described [1, 2] but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published [3, 4] but require microgram amounts of starting total RNA.
Here we describe an mRNA-Seq library preparation protocol for the Illumina Genome Analyzer II platform that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing [5], where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome [6] and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples [7].
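The difference between the two units mentioned above can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of RPKM and TPM; the function names, transcript lengths, and read counts are hypothetical toy values chosen for illustration and are not taken from the protocol or its data.

```python
# Illustrative sketch only: function names and toy numbers are assumptions,
# not part of the published protocol.

def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * length)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: normalize by length first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Three hypothetical transcripts measured in two hypothetical samples.
lengths = [1000, 2000, 4000]
sample_a = [100, 200, 400]
sample_b = [100, 200, 4000]

for name, counts in (("A", sample_a), ("B", sample_b)):
    print(name,
          "TPM sum:", round(sum(tpm(counts, lengths))),
          "RPKM sum:", round(sum(rpkm(counts, lengths))))
```

In this sketch the TPM values of every sample add up to one million, whereas the RPKM totals differ from sample to sample, which is one commonly given reason why TPM values are easier to compare across samples.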
Molecular Biology, Issue 56, Genetics, mRNA-Seq, Illumina-Seq, gene expression profiling, high throughput sequencing
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus.
Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus, Asian tiger mosquito, photoperiodic diapause, RNA-Seq, de novo transcriptome assembly, mosquito husbandry
Development of Cell-type specific anti-HIV gp120 aptamers for siRNA delivery
Institutions: Beckman Research Institute of City of Hope.
The global epidemic of infection by HIV has created an urgent need for new classes of antiretroviral agents. The potent ability of small interfering (si)RNAs to inhibit the expression of complementary RNA transcripts is being exploited as a new class of therapeutics for a variety of diseases including HIV. Many previous reports have shown that novel RNAi-based anti-HIV/AIDS therapeutic strategies have considerable promise; however, a key obstacle to the successful therapeutic application and clinical translation of siRNAs is efficient delivery. In particular, considering the safety and efficacy of RNAi-based therapeutics, it is highly desirable to develop a targeted intracellular siRNA delivery approach to specific cell populations or tissues. The HIV-1 gp120 protein, an envelope glycoprotein on the surface of HIV-1, plays an important role in viral entry into CD4 cells. The interaction of gp120 and CD4 that triggers HIV-1 entry and initiates cell fusion has been validated as a clinically relevant anti-viral target for drug discovery.
Herein, we first discuss the selection and identification of 2'-F modified anti-HIV gp120 RNA aptamers. Using a conventional nitrocellulose filter SELEX method, several new aptamers with nanomolar affinity were isolated from a 50-nt random RNA library. To obtain bound species with higher affinity, the selection stringency is carefully controlled by adjusting the conditions. The selected aptamers can specifically bind to and be rapidly internalized into cells expressing the HIV-1 envelope protein. Additionally, the aptamers alone can neutralize HIV-1 infectivity. Based upon the best aptamer, A-1, we also create a novel dual inhibitory function anti-gp120 aptamer-siRNA chimera in which both the aptamer and the siRNA portions have potent anti-HIV activities. Further, we utilize the gp120 aptamer-siRNA chimeras for cell-type specific delivery of the siRNA into HIV-1 infected cells. This dual function chimera shows considerable potential for combining various nucleic acid therapeutic agents (aptamer and siRNA) in suppressing HIV-1 infection, making the aptamer-siRNA chimeras attractive therapeutic candidates for patients failing highly active antiretroviral therapy (HAART).
Immunology, Issue 52, SELEX (Systematic Evolution of Ligands by EXponential enrichment), RNA aptamer, HIV-1 gp120, RNAi (RNA interference), siRNA (small interfering RNA), cell-type specific delivery
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e. total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contributions of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation
Institutions: Agency for Science, Technology and Research, Singapore, A*STAR-Duke-NUS Neuroscience Research Partnership, Singapore, National University of Singapore, Singapore.
Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces [7, 2, 12]. Such architectures are not random and involve interactions between gene promoters and regulatory elements [13]. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination [1, 14].
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures [5, 6]. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis are then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions [8].
We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video.
Genetics, Issue 62, ChIP, ChIA-PET, Chromatin Interactions, Genomics, Next-Generation Sequencing
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values [1]. 'How to use PSA' remains a current issue, both for diagnosis, where a gray zone corresponding to a serum concentration of 2.5-10 ng/ml does not allow a clear differentiation to be made between cancer and noncancer [2], and for patient follow-up, where analysis of post-operative PSA kinetic parameters can pose considerable challenges for practical application [3, 4]. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g. PCA3 in prostate cancer [5, 6], and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5x higher than that corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as this is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes, leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer [7-10]. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, and whether they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Mouse Genome Engineering Using Designer Nucleases
Institutions: University of Zurich, University of Minnesota.
Transgenic mice carrying site-specific genome modifications (knockout, knock-in) are of vital importance for dissecting complex biological systems as well as for modeling human diseases and testing therapeutic strategies. Recent advances in the use of designer nucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system for site-specific genome engineering open the possibility to perform rapid targeted genome modification in virtually any laboratory species without the need to rely on embryonic stem (ES) cell technology. A genome editing experiment typically starts with identification of designer nuclease target sites within a gene of interest followed by construction of custom DNA-binding domains to direct nuclease activity to the investigator-defined genomic locus. Designer nuclease plasmids are in vitro transcribed to generate mRNA for microinjection of fertilized mouse oocytes. Here, we provide a protocol for achieving targeted genome modification by direct injection of TALEN mRNA into fertilized mouse oocytes.
Genetics, Issue 86, Oocyte microinjection, Designer nucleases, ZFN, TALEN, Genome Engineering
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal.
There are several lines of evidence supporting the role of de novo mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness [1] and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them [2]. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, there is the effect of paternal age: the risk of the disease increases significantly with increasing paternal age, which could result from the age-related increase in paternal de novo mutations. This is the case for autism and schizophrenia [3]. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo mutations would more frequently come from males, particularly older males [4]. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complex diseases, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification of de novo mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing
Fluorescent Labeling of COS-7 Expressing SNAP-tag Fusion Proteins for Live Cell Imaging
Institutions: New England Biolabs.
SNAP-tag and CLIP-tag protein labeling systems enable the specific, covalent attachment of molecules, including fluorescent dyes, to a protein of interest in live cells. These systems offer a broad selection of fluorescent substrates optimized for a range of imaging instrumentation. Once cloned and expressed, the tagged protein can be used with a variety of substrates for numerous downstream applications without having to clone again. There are two steps to using this system: cloning and expression of the protein of interest as a SNAP-tag fusion, and labeling of the fusion with the SNAP-tag substrate of choice. The SNAP-tag is a small protein based on human O6-alkylguanine-DNA-alkyltransferase (hAGT), a DNA repair protein. SNAP-tag labels are dyes conjugated to guanine or chloropyrimidine leaving groups via a benzyl linker. In the labeling reaction, the substituted benzyl group of the substrate is covalently attached to the SNAP-tag. CLIP-tag is a modified version of SNAP-tag, engineered to react with benzylcytosine rather than benzylguanine derivatives. When used in conjunction with SNAP-tag, CLIP-tag enables the orthogonal and complementary labeling of two proteins simultaneously in the same cells.
Cellular Biology, Issue 39, fluorescence, labeling, imaging, SNAP-tag, tag, microscopy, AGT, surface, intracellular, fusion