Hematopoietic stem cells (HSCs) are used clinically in transplantation to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSC self-renewal and differentiation is important for the application of HSCs in research and clinical use. However, it is not possible to obtain large quantities of HSCs due to their inability to proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
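The last analysis step described above, ranking transcription factors by how strongly they differ between Lin-CD34+ and Lin-CD34- cells, can be sketched roughly as follows. This is only an illustration under stated assumptions: the gene names and expression values are made up, the function name `rank_differential_tfs` is hypothetical, and a simple log2 fold change with a pseudocount stands in for whatever statistical test the actual protocol uses (a real analysis would rely on replicates and a dedicated differential expression tool).

```python
import math

def rank_differential_tfs(expr_a, expr_b, tf_names, pseudocount=1.0, min_abs_log2fc=1.0):
    """Return (gene, log2 fold change) pairs for transcription factors,
    sorted by absolute fold change, largest first.

    expr_a, expr_b: dicts mapping gene name -> normalized expression value.
    tf_names: genes annotated as transcription factors (assumed to be known).
    """
    ranked = []
    for gene in tf_names:
        if gene not in expr_a or gene not in expr_b:
            continue  # skip factors not measured in both populations
        # The pseudocount avoids division by zero for unexpressed genes.
        log2fc = math.log2((expr_a[gene] + pseudocount) / (expr_b[gene] + pseudocount))
        if abs(log2fc) >= min_abs_log2fc:
            ranked.append((gene, log2fc))
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked

# Made-up expression values, for illustration only.
cd34_pos = {"TfA": 31.0, "TfB": 3.0, "TfC": 10.0, "GeneX": 200.0}
cd34_neg = {"TfA": 1.0, "TfB": 15.0, "TfC": 10.0, "GeneX": 5.0}
hits = rank_differential_tfs(cd34_pos, cd34_neg, tf_names=["TfA", "TfB", "TfC"])
for gene, log2fc in hits:
    print(f"{gene}\t{log2fc:+.2f}")
```

Restricting the ranking to annotated transcription factors keeps strongly changing but non-regulatory genes (here the hypothetical `GeneX`) out of the candidate list.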
25 Related JoVE Articles!
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e. total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contributions of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
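As a rough illustration of how an RNA half-life could be derived from 4sU-tagging data, the sketch below converts the ratio of newly transcribed to total RNA into a half-life. It rests on assumptions that the abstract does not state: steady-state expression, first-order decay, and no correction for cell growth, so that the labeled fraction after a labeling time t is R = 1 - exp(-kt) and the half-life is -t·ln 2 / ln(1 - R). The function name is hypothetical.

```python
import math

def half_life_from_new_to_total(ratio_new_total, labeling_minutes):
    """Estimate an RNA half-life (in minutes) from the newly transcribed /
    total RNA ratio measured after a given 4sU labeling time.

    Assumes steady state and first-order decay (an assumption of this sketch).
    """
    if not 0.0 < ratio_new_total < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return -labeling_minutes * math.log(2) / math.log(1.0 - ratio_new_total)

# If half of a transcript pool is newly made after 60 min of labeling,
# the half-life under these assumptions equals the labeling time.
half_life = half_life_from_new_to_total(0.5, 60)
print(round(half_life, 1))
```

Ratios close to 0 or 1 give unstable estimates, so the labeling time generally needs to be of the same order as the half-lives of interest.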
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
Purification of Transcripts and Metabolites from Drosophila Heads
Institutions: University of Florida.
For the last decade, we have tried to understand the molecular and cellular mechanisms of neuronal degeneration using Drosophila as a model organism. Although fruit flies provide obvious experimental advantages, research on neurodegenerative diseases has mostly relied on traditional techniques, including genetic interaction, histology, immunofluorescence, and protein biochemistry. These techniques are effective for mechanistic, hypothesis-driven studies, which lead to a detailed understanding of the role of single genes in well-defined biological problems. However, neurodegenerative diseases are highly complex and affect multiple cellular organelles and processes over time. The advent of new technologies and the omics age provides a unique opportunity to understand the global cellular perturbations underlying complex diseases. Flexible model organisms such as Drosophila are ideal for adapting these new technologies because of their strong annotation and high tractability. One challenge with these small animals, though, is the purification of enough informational molecules (DNA, mRNA, protein, metabolites) from highly relevant tissues such as fly brains. Other challenges consist of collecting large numbers of flies for experimental replicates (critical for statistical robustness) and developing consistent procedures for the purification of high-quality biological material. Here, we describe the procedures for collecting thousands of fly heads and the extraction of transcripts and metabolites to understand how global changes in gene expression and metabolism contribute to neurodegenerative diseases. These procedures are easily scalable and can be applied to the study of proteomic and epigenomic contributions to disease.
Genetics, Issue 73, Biochemistry, Molecular Biology, Neurobiology, Neuroscience, Bioengineering, Cellular Biology, Anatomy, Neurodegenerative Diseases, Biological Assay, Drosophila, fruit fly, head separation, purification, mRNA, RNA, cDNA, DNA, transcripts, metabolites, replicates, SCA3, neurodegeneration, NMR, gene expression, animal model
A Comparative Approach to Characterize the Landscape of Host-Pathogen Protein-Protein Interactions
Institutions: Institut Pasteur, Université Sorbonne Paris Cité, Dana Farber Cancer Institute.
Significant efforts have been made to generate large-scale, comprehensive protein-protein interaction network maps. These maps are instrumental to understanding pathogen-host relationships and have mostly been produced by genetic screening in yeast two-hybrid systems. The recent improvement of protein-protein interaction detection by a Gaussia luciferase-based fragment complementation assay now offers the opportunity to develop integrative comparative interactomic approaches necessary to rigorously compare interaction profiles of proteins from different pathogen strain variants against a common set of cellular factors.
This paper specifically focuses on the utility of combining two orthogonal methods to generate protein-protein interaction datasets: yeast two-hybrid (Y2H) and a new assay, high-throughput Gaussia princeps protein complementation assay (HT-GPCA) performed in mammalian cells.
A large-scale identification of cellular partners of a pathogen protein is performed by mating-based yeast two-hybrid screening of cDNA libraries using multiple pathogen strain variants. A subset of interacting partners selected by high-confidence statistical scoring is further validated in mammalian cells for pair-wise interactions with the whole set of pathogen variant proteins using HT-GPCA. This combination of two complementary methods improves the robustness of the interaction dataset and allows a stringent comparative interaction analysis. Such comparative interactomics constitutes a reliable and powerful strategy to decipher pathogen-host interplay.
Immunology, Issue 77, Genetics, Microbiology, Biochemistry, Molecular Biology, Cellular Biology, Biomedical Engineering, Infection, Cancer Biology, Virology, Medicine, Host-Pathogen Interactions, Protein-protein interaction, High-throughput screening, Luminescence, Yeast two-hybrid, HT-GPCA, Network, protein, yeast, cell, culture
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values [1]. 'How to use PSA' remains a current issue, either for diagnosis as a gray zone corresponding to a concentration in serum of 2.5-10 ng/ml which does not allow a clear differentiation to be made between cancer and noncancer [2] or for patient follow-up as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application [3,4]. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g. PCA3 in prostate cancer [5,6], and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5x higher than the one corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer [7-10]. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Detection of the Genome and Transcripts of a Persistent DNA Virus in Neuronal Tissues by Fluorescent In situ Hybridization Combined with Immunostaining
Institutions: CNRS UMR 5534, Université de Lyon 1, LabEX DEVweCAN, CNRS UPR 3296, CNRS UMR 5286.
Single cell codetection of a gene, its RNA product and cellular regulatory proteins is critical to study gene expression regulation. This is a challenge in the field of virology, in particular for nuclear-replicating persistent DNA viruses that involve animal models for their study. Herpes simplex virus type 1 (HSV-1) establishes a life-long latent infection in peripheral neurons. Latent virus serves as a reservoir, from which it reactivates and induces a new herpetic episode. The cell biology of HSV-1 latency remains poorly understood, in part due to the lack of methods to detect HSV-1 genomes in situ in animal models. We describe a DNA-fluorescent in situ hybridization (FISH) approach efficiently detecting low-copy viral genomes within sections of neuronal tissues from infected animal models. The method relies on heat-based antigen unmasking, and directly labeled home-made DNA probes, or commercially available probes. We developed a triple staining approach, combining DNA-FISH with RNA-FISH and immunofluorescence, using peroxidase based signal amplification to accommodate each staining requirement. A major improvement is the ability to obtain, within 10 µm tissue sections, low-background signals that can be imaged at high resolution by confocal microscopy and wide-field conventional epifluorescence. Additionally, the triple staining worked with a wide range of antibodies directed against cellular and viral proteins. The complete protocol takes 2.5 days to accommodate antibody and probe penetration within the tissue.
Neuroscience, Issue 83, Life Sciences (General), Virology, Herpes Simplex Virus (HSV), Latency, In situ hybridization, Nuclear organization, Gene expression, Microscopy
A Restriction Enzyme Based Cloning Method to Assess the In vitro Replication Capacity of HIV-1 Subtype C Gag-MJ4 Chimeric Viruses
Institutions: Emory University.
The protective effect of many HLA class I alleles on HIV-1 pathogenesis and disease progression is, in part, attributed to their ability to target conserved portions of the HIV-1 genome that escape with difficulty. Sequence changes attributed to cellular immune pressure arise across the genome during infection, and if found within conserved regions of the genome such as Gag, can affect the ability of the virus to replicate in vitro. Transmission of HLA-linked polymorphisms in Gag to HLA-mismatched recipients has been associated with reduced set point viral loads. We hypothesized this may be due to a reduced replication capacity of the virus. Here we present a novel method for assessing the in vitro replication of HIV-1 as influenced by the gag gene isolated from acute time points from subtype C infected Zambians. This method uses restriction enzyme based cloning to insert the gag gene into a common subtype C HIV-1 proviral backbone, MJ4. This makes it more appropriate to the study of subtype C sequences than previous recombination based methods that have assessed the in vitro replication of chronically derived gag-pro sequences. Nevertheless, the protocol could be readily modified for studies of viruses from other subtypes. Moreover, this protocol details a robust and reproducible method for assessing the replication capacity of the Gag-MJ4 chimeric viruses on a CEM-based T cell line. This method was utilized for the study of Gag-MJ4 chimeric viruses derived from 149 subtype C acutely infected Zambians, and has allowed for the identification of residues in Gag that affect replication. More importantly, the implementation of this technique has facilitated a deeper understanding of how viral replication defines parameters of early HIV-1 pathogenesis such as set point viral load and longitudinal CD4+ T cell decline.
Infectious Diseases, Issue 90, HIV-1, Gag, viral replication, replication capacity, viral fitness, MJ4, CEM, GXR25
Flat Mount Preparation for Observation and Analysis of Zebrafish Embryo Specimens Stained by Whole Mount In situ Hybridization
Institutions: University of Notre Dame.
The zebrafish embryo is now commonly used for basic and biomedical research to investigate the genetic control of developmental processes and to model congenital abnormalities. During the first day of life, the zebrafish embryo progresses through many developmental stages including fertilization, cleavage, gastrulation, segmentation, and the organogenesis of structures such as the kidney, heart, and central nervous system. The anatomy of a young zebrafish embryo presents several challenges for the visualization and analysis of the tissues involved in many of these events because the embryo develops in association with a round yolk mass. Thus, for accurate analysis and imaging of experimental phenotypes in fixed embryonic specimens between the tailbud and 20 somite stage (10 and 19 hours post fertilization (hpf), respectively), such as those stained using whole mount in situ hybridization (WISH), it is often desirable to remove the embryo from the yolk ball and to position it flat on a glass slide. However, performing a flat mount procedure can be tedious. Therefore, successful and efficient flat mount preparation is greatly facilitated through the visual demonstration of the dissection technique, and also helped by using reagents that assist in optimal tissue handling. Here, we provide our WISH protocol for one or two-color detection of gene expression in the zebrafish embryo, and demonstrate how the flat mounting procedure can be performed on this example of a stained fixed specimen. This flat mounting protocol is broadly applicable to the study of many embryonic structures that emerge during early zebrafish development, and can be implemented in conjunction with other staining methods performed on fixed embryo samples.
Developmental Biology, Issue 89, animals, vertebrates, fishes, zebrafish, growth and development, morphogenesis, embryonic and fetal development, organogenesis, natural science disciplines, embryo, whole mount in situ hybridization, flat mount, deyolking, imaging
Analysis of Nephron Composition and Function in the Adult Zebrafish Kidney
Institutions: University of Notre Dame.
The zebrafish model has emerged as a relevant system to study kidney development, regeneration and disease. Both the embryonic and adult zebrafish kidneys are composed of functional units known as nephrons, which are highly conserved with other vertebrates, including mammals. Research in zebrafish has recently demonstrated that two distinctive phenomena transpire after adult nephrons incur damage: first, there is robust regeneration within existing nephrons that replaces the destroyed tubule epithelial cells; second, entirely new nephrons are produced from renal progenitors in a process known as neonephrogenesis. In contrast, humans and other mammals seem to have only a limited ability for nephron epithelial regeneration. To date, the mechanisms responsible for these kidney regeneration phenomena remain poorly understood. Since adult zebrafish kidneys undergo both nephron epithelial regeneration and neonephrogenesis, they provide an outstanding experimental paradigm to study these events. Further, there is a wide range of genetic and pharmacological tools available in the zebrafish model that can be used to delineate the cellular and molecular mechanisms that regulate renal regeneration. One essential aspect of such research is the evaluation of nephron structure and function. This protocol describes a set of labeling techniques that can be used to gauge renal composition and test nephron functionality in the adult zebrafish kidney. Thus, these methods are widely applicable to the future phenotypic characterization of adult zebrafish kidney injury paradigms, which include but are not limited to, nephrotoxicant exposure regimes or genetic methods of targeted cell death such as the nitroreductase mediated cell ablation technique. Further, these methods could be used to study genetic perturbations in adult kidney formation and could also be applied to assess renal status during chronic disease modeling.
Cellular Biology, Issue 90, zebrafish, kidney, nephron, nephrology, renal, regeneration, proximal tubule, distal tubule, segment, mesonephros, physiology, acute kidney injury (AKI)
Massively Parallel Reporter Assays in Cultured Mammalian Cells
Institutions: Broad Institute.
The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence of interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.
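One common way to turn multiplexed tag-sequencing counts into a per-construct activity estimate is to compare how often each tag appears in the expressed RNA with how often it appears in the transfected plasmid pool. The sketch below illustrates that idea only; the abstract does not specify the scoring used in the protocol, so the normalization, the pseudocount, the tag names, and the function name `mpra_activity` are all assumptions.

```python
import math

def mpra_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Return log2(RNA fraction / DNA fraction) for every tag in dna_counts.

    rna_counts: tag -> reads from reporter mRNA.
    dna_counts: tag -> reads from the plasmid library (input abundance).
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    activity = {}
    for tag, dna in dna_counts.items():
        rna = rna_counts.get(tag, 0)
        # The pseudocount keeps tags with zero RNA reads finite.
        rna_fraction = (rna + pseudocount) / rna_total
        dna_fraction = (dna + pseudocount) / dna_total
        activity[tag] = math.log2(rna_fraction / dna_fraction)
    return activity

# Made-up counts for two hypothetical tags present at equal plasmid abundance.
dna = {"tagA": 100, "tagB": 100}
rna = {"tagA": 300, "tagB": 100}
activity = mpra_activity(rna, dna)
for tag, score in activity.items():
    print(tag, round(score, 2))
```

Dividing by the plasmid counts matters because library constructs are rarely represented equally, and raw RNA counts alone would confound abundance with regulatory activity.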
Genetics, Issue 90, gene regulation, transcriptional regulation, sequence-activity mapping, reporter assay, library cloning, transfection, tag sequencing, mammalian cells
RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord
Institutions: Universidade de São Paulo.
In ovo electroporation of the chick neural tube is a fast and inexpensive method for identification of gene function during neural development. Genome wide analysis of differentially expressed transcripts after such an experimental manipulation has the potential to uncover an almost complete picture of the downstream effects caused by the transfected construct. This work describes a simple method for comparing transcriptomes from samples of transfected embryonic spinal cords comprising all steps between electroporation and identification of differentially expressed transcripts. The first stage consists of guidelines for electroporation and instructions for dissection of transfected spinal cord halves from HH23 embryos in a ribonuclease-free environment and extraction of high-quality RNA samples suitable for transcriptome sequencing. The next stage is that of bioinformatic analysis with general guidelines for filtering and comparison of RNA-Seq datasets in the Galaxy public server, which eliminates the need for a local computational structure for small to medium scale experiments. The representative results show that the dissection methods generate high quality RNA samples and that the transcriptomes obtained from two control samples are essentially the same, an important requirement for detection of differentially expressed genes in experimental samples. Furthermore, one example is provided where experimental overexpression of a DNA construct can be visually verified after comparison with control samples. The application of this method may be a powerful tool to facilitate new discoveries on the function of neural factors involved in early spinal cord development.
Developmental Biology, Issue 93, chicken embryo, in ovo electroporation, spinal cord, RNA-Seq, transcriptome profiling, Galaxy workflow
Highly Efficient Ligation of Small RNA Molecules for MicroRNA Quantitation by High-Throughput Sequencing
Institutions: University of Colorado, Boulder, University of Colorado, Denver.
MiRNA cloning and high-throughput sequencing, termed miR-Seq, stands alone as a transcriptome-wide approach to quantify miRNAs with single nucleotide resolution. This technique captures miRNAs by attaching 3’ and 5’ oligonucleotide adapters to miRNA molecules and allows de novo miRNA discovery. Coupled with powerful next-generation sequencing platforms, miR-Seq has been instrumental in the study of miRNA biology. However, significant biases introduced by oligonucleotide ligation steps have prevented miR-Seq from being employed as an accurate quantitation tool. Previous studies demonstrate that biases in current miR-Seq methods often lead to inaccurate miRNA quantification with errors up to 1,000-fold for some miRNAs [1,2]. To resolve these biases imparted by RNA ligation, we have developed a small RNA ligation method that results in ligation efficiencies of over 95% for both 3’ and 5’ ligation steps. Benchmarking this improved library construction method using equimolar or differentially mixed synthetic miRNAs consistently yields read numbers with less than two-fold deviation from the expected value. Furthermore, this high-efficiency miR-Seq method permits accurate genome-wide miRNA profiling from in vivo total RNA samples [2].
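The benchmarking criterion mentioned above, less than two-fold deviation from the expected value for a synthetic miRNA pool, could be checked along the following lines. This is a minimal sketch: the read counts and miRNA names are invented, and the function name `fold_deviation` is hypothetical rather than part of the published method.

```python
def fold_deviation(read_counts, expected_fractions):
    """Return, for each miRNA, the fold deviation (always >= 1) of its
    observed read fraction from the fraction expected from the input mix."""
    total = sum(read_counts.values())
    deviations = {}
    for name, count in read_counts.items():
        if count == 0:
            deviations[name] = float("inf")  # never detected
            continue
        ratio = (count / total) / expected_fractions[name]
        # Report over- and under-representation on the same scale.
        deviations[name] = max(ratio, 1.0 / ratio)
    return deviations

# Invented counts for an equimolar pool of four synthetic miRNAs.
counts = {"syn-1": 250, "syn-2": 300, "syn-3": 200, "syn-4": 250}
expected = {name: 0.25 for name in counts}
deviations = fold_deviation(counts, expected)
within_two_fold = all(value < 2.0 for value in deviations.values())
print(deviations, within_two_fold)
```

For a differentially mixed pool, only `expected_fractions` changes; the comparison itself stays the same.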
Molecular Biology, Issue 93, RNA, ligation, miRNA, miR-Seq, linker, oligonucleotide, high-throughput sequencing
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g. drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism [1,2]. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence [3].
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin" [4], in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Substrate Generation for Endonucleases of CRISPR/Cas Systems
Institutions: Max-Planck-Institute for Terrestrial Microbiology.
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) [1-3]. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems [4]. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster [5]. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs [6-8]. These crRNAs consist of the spacer sequence flanked by a 5' terminal (8 nucleotides) and a 3' terminal tag derived from the repeat sequence [9]. A repeated infection of the virus can now be blocked as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity [10]. Finally, for CRISPR/Cas type 1 systems, the nuclease Cas3 will destroy the detected invader DNA [11,12].
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases [4].
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
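The relationship between a pre-crRNA and its mature crRNAs can be illustrated in silico. The sketch below assumes, based only on the 8-nucleotide 5' tag mentioned above, that each repeat is cut once so that its last 8 nucleotides stay attached to the following spacer and the remainder forms the 3' tag of the preceding crRNA. The repeat and spacer sequences are invented, the function name is hypothetical, and actual Cas6 enzymes may cut elsewhere or be followed by further 3' trimming.

```python
def process_pre_crrna(pre_crrna, repeat, tag_len=8):
    """Split a pre-crRNA at one site per repeat and return the internal
    fragments, i.e. 5' tag + spacer + 3' tag for each spacer.

    tag_len: number of repeat nucleotides left 5' of the spacer (assumed 8).
    """
    cut_offset = len(repeat) - tag_len
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + cut_offset)
        start = pre_crrna.find(repeat, start + len(repeat))
    # Fragments between consecutive cut sites are the mature crRNAs; the
    # leader fragment and the trailing partial repeat are discarded here.
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]

# Invented toy sequences, not taken from any real CRISPR locus.
repeat = "GUUGCAAGGGAUUGAAAC"
spacers = ["ACGUACGUACGU", "UUGGCCAAUUGG"]
pre = repeat + spacers[0] + repeat + spacers[1] + repeat
crrnas = process_pre_crrna(pre, repeat)
for crrna in crrnas:
    print(crrna)
```

Each printed fragment starts with the 8-nucleotide tag `AUUGAAAC`, followed by one spacer and the remaining repeat portion.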
Molecular biology, Issue 67, CRISPR/Cas, endonuclease, in vitro transcription, crRNA, Cas6
Analyzing Gene Expression from Marine Microbial Communities using Environmental Transcriptomics
Institutions: University of Georgia (UGA).
Analogous to metagenomics, environmental transcriptomics (metatranscriptomics) retrieves and sequences environmental mRNAs from a microbial assemblage without prior knowledge of what genes the community might be expressing. Thus it provides the most unbiased perspective on community gene expression in situ. Environmental transcriptomics protocols are technically difficult since prokaryotic mRNAs generally lack the poly(A) tails that make isolation of eukaryotic messages relatively straightforward [1] and because of the relatively short half lives of mRNAs [2]. In addition, mRNAs are much less abundant than rRNAs in total RNA extracts, thus an rRNA background often overwhelms mRNA signals. However, techniques for overcoming some of these difficulties have recently been developed. A procedure for analyzing environmental transcriptomes by creating clone libraries using random primers to reverse-transcribe and amplify environmental mRNAs was recently described and was successful in two different natural environments, but results were biased by selection of the random primers used to initiate cDNA synthesis [3]. Advances in linear amplification of mRNA obviate the need for random primers in the amplification step and make it possible to use less starting material, decreasing the collection and processing time of samples and thereby minimizing RNA degradation [4]. In vitro transcription methods for amplifying mRNA involve polyadenylating the mRNA and incorporating a T7 promoter onto the 3' end of the transcript. Amplified RNA (aRNA) can then be converted to double stranded cDNA using random hexamers and directly sequenced by pyrosequencing [5]. A first use of this method at Station ALOHA demonstrated its utility for characterizing microbial community gene expression [6].
Microbiology, Issue 24, transcriptomics, bacterioplankton, mRNA, microbial communities, gene expression
Split-Ubiquitin Based Membrane Yeast Two-Hybrid (MYTH) System: A Powerful Tool For Identifying Protein-Protein Interactions
Institutions: University of Toronto, University of Toronto, University of Toronto.
The fundamental biological and clinical importance of integral membrane proteins prompted the development of a yeast-based system for the high-throughput identification of protein-protein interactions (PPI) for full-length transmembrane proteins. To this end, our lab developed the split-ubiquitin based Membrane Yeast Two-Hybrid (MYTH) system. This technology allows for the sensitive detection of transient and stable protein interactions using Saccharomyces cerevisiae as a host organism. MYTH takes advantage of the observation that ubiquitin can be separated into two stable moieties: the C-terminal half of yeast ubiquitin (Cub) and the N-terminal half of the ubiquitin moiety (Nub). In MYTH, this principle is adapted for use as a 'sensor' of protein-protein interactions. Briefly, the integral membrane bait protein is fused to Cub, which is linked to an artificial transcription factor. Prey proteins, either in individual or library format, are fused to the Nub moiety. Protein interaction between the bait and prey leads to reconstitution of the ubiquitin moieties, forming a full-length 'pseudo-ubiquitin' molecule. This molecule is in turn recognized by cytosolic deubiquitinating enzymes, resulting in cleavage of the transcription factor and subsequent induction of reporter gene expression. The system is highly adaptable, and is particularly well-suited to high-throughput screening. It has been successfully employed to investigate interactions using integral membrane proteins from both yeast and other organisms.
Cellular Biology, Issue 36, protein-protein interaction, membrane, split-ubiquitin, yeast, library screening, Y2H, yeast two-hybrid, MYTH
An Allele-specific Gene Expression Assay to Test the Functional Basis of Genetic Associations
Institutions: University of Oxford.
The number of significant genetic associations with common complex traits is constantly increasing. However, most of these associations have not been understood at the molecular level. One of the mechanisms mediating the effect of DNA variants on phenotypes is gene expression, which has been shown to be particularly relevant for complex traits1.
This method tests in a cellular context the effect of specific DNA sequences on gene expression. The principle is to measure the relative abundance of transcripts arising from the two alleles of a gene, analysing cells which carry one copy of the DNA sequences associated with disease (the risk variants)2,3. Therefore, the cells used for this method should meet two fundamental genotypic requirements: they have to be heterozygous both for DNA risk variants and for DNA markers, typically coding polymorphisms, which can distinguish transcripts based on their chromosomal origin (Figure 1). DNA risk variants and DNA markers do not need to have the same allele frequency, but the phase (haplotypic) relationship of the genetic markers needs to be understood. It is also important to choose cell types which express the gene of interest. This protocol refers specifically to the procedure adopted to extract nucleic acids from fibroblasts, but the method is equally applicable to other cell types, including primary cells.
DNA and RNA are extracted from the selected cell lines and cDNA is generated. DNA and cDNA are analysed with a primer extension assay, designed to target the coding DNA markers4. The primer extension assay is carried out using the MassARRAY (Sequenom)5 platform according to the manufacturer's specifications. Primer extension products are then analysed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS). Because the selected markers are heterozygous, they will generate two peaks on the MS profiles. The area of each peak is proportional to the transcript abundance and can be measured with a function of the MassARRAY Typer software to generate an allelic ratio (allele 1: allele 2) calculation. The allelic ratio obtained for cDNA is normalized using that measured from genomic DNA, where the allelic ratio is expected to be 1:1, to correct for technical artifacts. Markers with a normalised allelic ratio significantly different from 1 indicate that the amount of transcript generated from the two chromosomes in the same cell is different, suggesting that the DNA variants associated with the phenotype have an effect on gene expression. Experimental controls should be used to confirm the results.
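The normalisation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the MassARRAY Typer software's own calculation: the function names and the peak-area values are hypothetical, and it assumes the two peak areas per marker have already been exported as plain numbers.

```python
def allelic_ratio(area_allele1, area_allele2):
    """Ratio of the two peak areas (allele 1 : allele 2) for one marker."""
    if area_allele2 <= 0:
        raise ValueError("peak area for allele 2 must be positive")
    return area_allele1 / area_allele2


def normalised_allelic_ratio(cdna_areas, gdna_areas):
    """Divide the cDNA ratio by the genomic DNA ratio.

    Genomic DNA from a heterozygous cell is expected to give 1:1, so any
    deviation there is treated as technical bias and divided out.
    """
    return allelic_ratio(*cdna_areas) / allelic_ratio(*gdna_areas)


# Hypothetical peak areas (arbitrary units) for a single heterozygous marker.
cdna_areas = (1500.0, 1000.0)   # transcript-derived signal
gdna_areas = (1050.0, 1000.0)   # genomic control, close to 1:1

ratio = normalised_allelic_ratio(cdna_areas, gdna_areas)
print(f"normalised allelic ratio: {ratio:.3f}")
```

A value close to 1 would indicate balanced expression from the two chromosomes. Whether a deviation is significant still has to be judged across replicates, which this sketch does not attempt.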
Cellular Biology, Issue 45, Gene expression, regulatory variant, haplotype, association study, primer extension, MALDI-TOF mass spectrometry, single nucleotide polymorphism, allele-specific
Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA
Institutions: Morgridge Institute for Research, University of Wisconsin, University of California.
Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity and a large dynamic range, and allows measurement of transcript copy numbers in a sample. Illumina's Genome Analyzer performs sequencing of a large number (> 10^7) of relatively short sequence reads (< 150 bp). The "paired end" approach, wherein a single long read is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo
One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples has been described1, 2 but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published3, 4 but require microgram amounts of starting total RNA.
Here we describe a protocol for mRNA-Seq library preparation for the Illumina Genome Analyzer II platform that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing5, where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome6 and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples7.
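To make the TPM versus RPKM comparison concrete, the sketch below computes both from read counts and transcript lengths using their commonly used definitions. It is an assumption that these match the exact normalization used in the protocol; the function names, counts and lengths are made up for illustration.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * length)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalise first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Hypothetical sample with three transcripts.
counts = [100, 200, 700]        # mapped reads per transcript
lengths_bp = [1000, 2000, 500]  # transcript lengths in base pairs

tpm_values = tpm(counts, lengths_bp)
rpkm_values = rpkm(counts, lengths_bp)

print("TPM: ", [round(v, 1) for v in tpm_values])
print("RPKM:", [round(v, 1) for v in rpkm_values])
print("sum of TPM: ", round(sum(tpm_values), 1))
print("sum of RPKM:", round(sum(rpkm_values), 1))
```

TPM values always sum to one million within a sample, so a given TPM represents the same fraction of the transcript pool in every sample. The sum of RPKM values depends on the length distribution of the expressed transcripts, which is one reason RPKM is harder to compare between samples.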
Molecular Biology, Issue 56, Genetics, mRNA-Seq, Illumina-Seq, gene expression profiling, high throughput sequencing
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although neither of these is effortless. As a corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in limited tissues, plant species or stress treatments, which is currently an important challenge in understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation
Institutions: Agency for Science, Technology and Research, Singapore, A*STAR-Duke-NUS Neuroscience Research Partnership, Singapore, National University of Singapore, Singapore.
Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces 7, 2, 12. Such architectures are not random and involve interactions between gene promoters and regulatory elements 13. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination 1, 14.
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures 5,6. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis is then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions 8.
We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video.
Genetics, Issue 62, ChIP, ChIA-PET, Chromatin Interactions, Genomics, Next-Generation Sequencing
Engineering and Evolution of Synthetic Adeno-Associated Virus (AAV) Gene Therapy Vectors via DNA Family Shuffling
Institutions: Heidelberg University, Heidelberg University.
Adeno-associated viral (AAV) vectors represent some of the most potent and promising vehicles for therapeutic human gene transfer due to a unique combination of beneficial properties1. These include the apathogenicity of the underlying wildtype viruses and the highly advanced methodologies for production of high-titer, high-purity and clinical-grade recombinant vectors2. A further particular advantage of the AAV system over other viruses is the availability of a wealth of naturally occurring serotypes which differ in essential properties yet can all be easily engineered as vectors using a common protocol1,2. Moreover, a number of groups including our own have recently devised strategies to use these natural viruses as templates for the creation of synthetic vectors which either combine the assets of multiple input serotypes, or which enhance the properties of a single isolate. The respective technologies to achieve these goals are either DNA family shuffling3, that is, fragmentation of various AAV capsid genes followed by their re-assembly based on partial homologies (typically >80% for most AAV serotypes), or peptide display4,5, that is, insertion of usually seven amino acids into an exposed loop of the viral capsid where the peptide ideally mediates re-targeting to a desired cell type. For maximum success, both methods are applied in a high-throughput fashion whereby the protocols are up-scaled to yield libraries of around one million distinct capsid variants. Each clone then comprises a unique combination of numerous parental viruses (DNA shuffling approach) or contains a distinctive peptide within the same viral backbone (peptide display approach). The subsequent final step is iterative selection of such a library on target cells in order to enrich for individual capsids fulfilling most or ideally all requirements of the selection process. The latter preferably combines positive pressure, such as growth on a certain cell type of interest, with negative selection, for instance elimination of all capsids reacting with anti-AAV antibodies. This combination increases the chances that synthetic capsids surviving the selection match the needs of the given application in a manner that would probably not have been found in any naturally occurring AAV isolate. Here, we focus on the DNA family shuffling method as the theoretically and experimentally more challenging of the two technologies. We describe and demonstrate all essential steps for the generation and selection of shuffled AAV libraries (Fig. 1), and then discuss the pitfalls and critical aspects of the protocols that one needs to be aware of in order to succeed with molecular AAV evolution.
Immunology, Issue 62, Adeno-associated virus, AAV, gene therapy, synthetic biology, viral vector, molecular evolution, DNA shuffling
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI)1. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7, but GI information remains sparse for prokaryotes8, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10.
Here, we present the key steps required to perform a quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9, using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm) - marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan) - marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2, as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
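The idea of a double mutant growing "better or worse than expected" can be illustrated with a small Python sketch. The abstract does not say how the expected fitness is computed, so this sketch assumes a multiplicative model (expected double-mutant fitness is the product of the two single-mutant fitnesses), which is a common convention rather than the authors' stated method. The function names, the fitness values and the 0.2 cutoff are all hypothetical, and a fixed cutoff stands in for the statistical test a real screen would use.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the multiplicative expectation.

    Fitness values are assumed to be colony sizes normalised so that the
    wild-type (or plate median) equals 1.0.
    """
    expected = fitness_a * fitness_b
    return fitness_ab - expected


def classify_interaction(score, cutoff=0.2):
    """Label a score; the cutoff is an arbitrary illustrative value."""
    if score <= -cutoff:
        return "aggravating"
    if score >= cutoff:
        return "alleviating"
    return "neutral"


# Hypothetical gene pairs: (single A, single B, double mutant) fitness.
examples = {
    "compensatory pathways": (0.9, 0.8, 0.1),
    "same pathway": (0.6, 0.6, 0.6),
    "unrelated genes": (1.0, 0.9, 0.9),
}

labels = {}
for name, (f_a, f_b, f_ab) in examples.items():
    score = interaction_score(f_a, f_b, f_ab)
    labels[name] = classify_interaction(score)
    print(f"{name}: score={score:+.2f} -> {labels[name]}")
```

In the first pair each single mutant grows well but the double mutant barely grows, which corresponds to the synthetic sickness case. In the second pair the double mutant is no worse than either single mutant, which is the pattern described for genes in the same pathway or complex.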
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Reverse Genetics Mediated Recovery of Infectious Murine Norovirus
Institutions: Imperial College London.
Human noroviruses are responsible for most cases of human gastroenteritis (GE) worldwide and are a recurrent problem in environments where close person-to-person contact cannot be avoided 1, 2. During the last few years an increase in the incidence of outbreaks in hospitals has been reported, causing significant disruptions to their operational capacity as well as large economic losses. The identification of new antiviral approaches has been limited due to the inability of human noroviruses to complete a productive infection in cell culture 3. The recent isolation of a murine norovirus (MNV), closely related to human norovirus 4 but which can be propagated in cells 5, has opened new avenues for the investigation of these pathogens 6, 7.
MNV replication results in the synthesis of new positive sense genomic and subgenomic RNA molecules, the latter of which corresponds to the last third of the viral genome (Figure 1). MNV contains four different open reading frames (ORFs), of which ORF1 occupies most of the genome and encodes seven non-structural proteins (NS1-7) released from a polyprotein precursor. ORF2 and ORF3 are contained within the subgenomic RNA region and encode the capsid proteins (VP1 and VP2, respectively) (Figure 1). Recently, we have identified that an additional ORF4, overlapping ORF2 but in a different reading frame, is functional and encodes a mitochondrially localised virulence factor (VF1) 8.
Replication of positive sense RNA viruses, including noroviruses, takes place in the cytoplasm, resulting in the synthesis of new uncapped RNA genomes. To promote viral translation, viruses exploit different strategies aimed at recruiting the cellular protein synthesis machinery 9-11. Interestingly, norovirus translation is driven by the multifunctional viral protein-primer VPg covalently linked to the 5' end of both genomic and subgenomic RNAs 12-14. This sophisticated mechanism of translation is likely to be a major factor in the limited efficiency of viral recovery by conventional reverse genetics approaches.
Here we report two different strategies based on the generation of murine norovirus-1 (referred to as MNV hereafter) transcripts capped at the 5' end. One of the methods involves both in vitro synthesis and capping of viral RNA, whereas the second approach entails the transcription of MNV cDNA in cells expressing T7 RNA polymerase. The availability of these reverse genetics systems for the study of MNV and a small animal model has provided an unprecedented ability to dissect the role of viral sequences in replication and pathogenesis 15-17.
Virology, Issue 64, Immunology, Genetics, Infection, RNA virus, VPg, RNA capping, T7 RNA polymerase, calicivirus, norovirus
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal, Universite de Montreal, Universite de Montreal.
There are several lines of evidence supporting the role of de novo mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness1 and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant, relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them2. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, there is an effect of paternal age: the risk of the disease increases significantly with increasing paternal age, which could result from the age-related increase in paternal de novo mutations. This is the case for autism and schizophrenia3. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo mutations would more frequently come from males, particularly older males4. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complex diseases, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification of de novo mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. Recombination activity was shown with the subsets, which served as intermediates. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic, and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase, Tre, had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution
Pyrosequencing: A Simple Method for Accurate Genotyping
Institutions: Washington University in St. Louis.
Pharmacogenetic research benefits first-hand from the abundance of information provided by the completion of the Human Genome Project. With such a tremendous amount of data available comes an explosion of genotyping methods. Pyrosequencing(R) is one of the most thorough yet simple methods to date used to analyze polymorphisms. It can also identify tri-allelic polymorphisms, indels and short-repeat polymorphisms, and can determine allele percentages for methylation or pooled sample assessment. In addition, there is a standardized control sequence that provides internal quality control. This method has led to rapid and efficient single-nucleotide polymorphism evaluation, including many clinically relevant polymorphisms. The technique and methodology of Pyrosequencing is explained.
Cellular Biology, Issue 11, Springer Protocols, Pyrosequencing, genotype, polymorphism, SNP, pharmacogenetics, pharmacogenomics, PCR