Hematopoietic stem cells (HSCs) are used clinically in transplantation to rebuild a patient's hematopoietic system in diseases such as leukemia and lymphoma. Elucidating the mechanisms that control HSC self-renewal and differentiation is important for both research and clinical applications of HSCs. However, HSCs cannot be obtained in large quantities because they do not proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow-derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) is increasingly replacing microarrays in gene expression studies. We report here a detailed method for using RNA-Seq to investigate potential key factors in the regulation of EML cell self-renewal and differentiation. The protocol is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part gives detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the RNA-Seq data analysis and explains how to use the data to identify transcription factors that are differentially expressed between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified as potential key regulators of EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
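The final analysis step described above, picking out transcription factors that differ most between Lin-CD34+ and Lin-CD34- cells, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the gene names, expression values, pseudocount and fold-change threshold are all hypothetical, and a real analysis would use replicate samples and a statistical test rather than a raw fold change.

```python
import math


def rank_differential_tfs(expr_pos, expr_neg, tf_names,
                          pseudocount=1.0, min_abs_log2fc=1.0):
    """Rank transcription factors by absolute log2 fold change.

    expr_pos / expr_neg map gene name -> normalized expression in the
    Lin-CD34+ and Lin-CD34- populations. The pseudocount and threshold
    are illustrative assumptions, not values from the protocol.
    """
    ranked = []
    for gene in tf_names:
        # Skip transcription factors that were not measured in both samples.
        if gene not in expr_pos or gene not in expr_neg:
            continue
        ratio = (expr_pos[gene] + pseudocount) / (expr_neg[gene] + pseudocount)
        log2fc = math.log2(ratio)
        if abs(log2fc) >= min_abs_log2fc:
            ranked.append((gene, log2fc))
    # Largest change first, regardless of direction.
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked


# Hypothetical expression values; GeneC is not in the transcription factor list.
cd34_pos = {"GeneA": 31.0, "GeneB": 7.0, "GeneC": 10.0, "GeneD": 0.0}
cd34_neg = {"GeneA": 3.0, "GeneB": 7.0, "GeneC": 43.0, "GeneD": 15.0}
transcription_factors = ["GeneA", "GeneB", "GeneD"]

ranked_tfs = rank_differential_tfs(cd34_pos, cd34_neg, transcription_factors)
for gene, log2fc in ranked_tfs:
    print(f"{gene}\t{log2fc:+.2f}")
```

Positive values indicate higher expression in Lin-CD34+ cells and negative values higher expression in Lin-CD34- cells; unchanged factors such as the hypothetical GeneB are filtered out.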
18 Related JoVE Articles!
RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord
Institutions: Universidade de São Paulo.
In ovo electroporation of the chick neural tube is a fast and inexpensive method for identifying gene function during neural development. Genome-wide analysis of differentially expressed transcripts after such an experimental manipulation has the potential to uncover an almost complete picture of the downstream effects caused by the transfected construct. This work describes a simple method for comparing transcriptomes from samples of transfected embryonic spinal cords, covering all steps between electroporation and identification of differentially expressed transcripts. The first stage consists of guidelines for electroporation and instructions for dissection of transfected spinal cord halves from HH23 embryos in a ribonuclease-free environment and extraction of high-quality RNA samples suitable for transcriptome sequencing. The next stage is bioinformatic analysis, with general guidelines for filtering and comparing RNA-Seq datasets on the Galaxy public server, which eliminates the need for a local computational infrastructure for small to medium scale experiments. The representative results show that the dissection methods generate high-quality RNA samples and that the transcriptomes obtained from two control samples are essentially the same, an important requirement for detecting differentially expressed genes in experimental samples. Furthermore, one example is provided where experimental overexpression of a DNA construct can be visually verified after comparison with control samples. This method may be a powerful tool to facilitate new discoveries on the function of neural factors involved in early spinal cord development.
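One way to check the requirement that two control transcriptomes are "essentially the same" is to correlate their log-transformed expression values. The sketch below is an assumption about how such a check could be done and is not part of the Galaxy workflow described above; the sample names and counts are hypothetical.

```python
import math


def log_expression_correlation(sample_a, sample_b):
    """Pearson correlation of log2(x + 1) expression for genes shared by both samples."""
    genes = sorted(set(sample_a) & set(sample_b))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    xs = [math.log2(sample_a[g] + 1.0) for g in genes]
    ys = [math.log2(sample_b[g] + 1.0) for g in genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("expression is constant in at least one sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical normalized expression values for two control samples.
control_1 = {"gene1": 0.0, "gene2": 3.0, "gene3": 15.0, "gene4": 63.0}
control_2 = {"gene1": 0.0, "gene2": 3.0, "gene3": 15.0, "gene4": 63.0}

r_controls = log_expression_correlation(control_1, control_2)
print(f"control vs control: r = {r_controls:.3f}")
```

A correlation close to 1 between controls suggests that differences seen in an experimental sample are less likely to reflect dissection or library variability; the acceptable cutoff is a judgment call for each experiment.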
Developmental Biology, Issue 93, chicken embryo, in ovo electroporation, spinal cord, RNA-Seq, transcriptome profiling, Galaxy workflow
Ablation of a Single Cell From Eight-cell Embryos of the Amphipod Crustacean Parhyale hawaiensis
Institutions: Harvard University.
The amphipod Parhyale hawaiensis is a small crustacean found in intertidal marine habitats worldwide. Over the past decade, Parhyale has emerged as a promising model organism for laboratory studies of development, providing a useful outgroup comparison to the well-studied arthropod model organism Drosophila melanogaster. In contrast to the syncytial cleavages of Drosophila, the early cleavages of Parhyale are holoblastic. Fate mapping using tracer dyes injected into early blastomeres has shown that all three germ layers and the germ line are established by the eight-cell stage. At this stage, three blastomeres are fated to give rise to the ectoderm, three are fated to give rise to the mesoderm, and the remaining two blastomeres are the precursors of the endoderm and germ line respectively. However, blastomere ablation experiments have shown that Parhyale embryos also possess significant regulatory capabilities, such that the fates of blastomeres ablated at the eight-cell stage can be taken over by the descendants of some of the remaining blastomeres. Blastomere ablation has previously been described by one of two methods: injection and subsequent activation of phototoxic dyes, or manual ablation. However, photoablation kills blastomeres but does not remove the dead cell body from the embryo. Complete physical removal of specific blastomeres may therefore be a preferred method of ablation for some applications. Here we present a protocol for manual removal of single blastomeres from the eight-cell stage of Parhyale embryos, illustrating the instruments and manual procedures necessary for complete removal of the cell body while keeping the remaining blastomeres alive and intact. This protocol can be applied to any Parhyale cell at the eight-cell stage, or to blastomeres of other early cleavage stages. In addition, this protocol could in principle be applied to early cleavage stage embryos of other holoblastically cleaving marine invertebrates.
Developmental Biology, Issue 85, Amphipod, experimental embryology, micromere, germ line, ablation, developmental potential, vasa
A Manual Small Molecule Screen Approaching High-throughput Using Zebrafish Embryos
Institutions: University of Notre Dame.
Zebrafish have become a widely used model organism to investigate the mechanisms that underlie developmental biology and to study human disease pathology due to their considerable degree of genetic conservation with humans. Chemical genetics entails testing the effect that small molecules have on a biological process and is becoming a popular translational research method to identify therapeutic compounds. Zebrafish are especially appealing for chemical genetics because they produce large clutches of transparent embryos, which are externally fertilized. Furthermore, zebrafish embryos can be easily drug treated by the simple addition of a compound to the embryo media. Using whole-mount in situ hybridization (WISH), mRNA expression can be clearly visualized within zebrafish embryos. Together, using chemical genetics and WISH, the zebrafish becomes a potent whole organism context in which to determine the cellular and physiological effects of small molecules. Innovative advances have been made in technologies that utilize machine-based screening procedures; however, for many labs such options are not accessible or remain cost-prohibitive. The protocol described here explains how to execute a manual high-throughput chemical genetic screen that requires basic resources and can be accomplished by a single individual or small team in an efficient period of time. Thus, this protocol provides a feasible strategy that research groups can implement to perform chemical genetics in zebrafish, which can be useful for gaining fundamental insights into developmental processes and disease mechanisms and for identifying novel compounds and signaling pathways that have medically relevant applications.
Developmental Biology, Issue 93, zebrafish, chemical genetics, chemical screen, in vivo small molecule screen, drug discovery, whole mount in situ hybridization (WISH), high-throughput screening (HTS), high-content screening (HCS)
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus. Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus, Asian tiger mosquito, photoperiodic diapause, RNA-Seq, de novo transcriptome assembly, mosquito husbandry
Dissection and Immunostaining of Imaginal Discs from Drosophila melanogaster
Institutions: Indiana University.
A significant portion of post-embryonic development in the fruit fly, Drosophila melanogaster, takes place within a set of sac-like structures called imaginal discs. These discs give rise to a large fraction of the structures found in the adult fly. Here we describe a protocol that has been optimized to recover these discs and prepare them for analysis with antibodies, transcriptional reporters and protein traps. This procedure is best suited for thin tissues like imaginal discs, but can be easily modified for use with thicker tissues such as the larval brain and adult ovary. The written protocol and accompanying video will guide the reader/viewer through the dissection of third instar larvae, fixation of tissue, and treatment of imaginal discs with antibodies. The protocol can be used to dissect imaginal discs from younger first and second instar larvae as well. The advantage of this protocol is that it is relatively short and has been optimized for high-quality preservation of the dissected tissue. Another advantage is that the fixation procedure works well with the overwhelming majority of antibodies that recognize Drosophila proteins. In our experience, only a very small number of sensitive antibodies do not work well with this procedure. In these situations, the remedy appears to be to use an alternate fixation cocktail while continuing to follow the guidelines that we have set forth for the dissection steps and antibody incubations.
Cellular Biology, Issue 91, Drosophila, imaginal discs, eye, retina, dissection, developmental biology
Analysis of Nephron Composition and Function in the Adult Zebrafish Kidney
Institutions: University of Notre Dame.
The zebrafish model has emerged as a relevant system to study kidney development, regeneration and disease. Both the embryonic and adult zebrafish kidneys are composed of functional units known as nephrons, which are highly conserved with other vertebrates, including mammals. Research in zebrafish has recently demonstrated that two distinctive phenomena transpire after adult nephrons incur damage: first, there is robust regeneration within existing nephrons that replaces the destroyed tubule epithelial cells; second, entirely new nephrons are produced from renal progenitors in a process known as neonephrogenesis. In contrast, humans and other mammals seem to have only a limited ability for nephron epithelial regeneration. To date, the mechanisms responsible for these kidney regeneration phenomena remain poorly understood. Since adult zebrafish kidneys undergo both nephron epithelial regeneration and neonephrogenesis, they provide an outstanding experimental paradigm to study these events. Further, there is a wide range of genetic and pharmacological tools available in the zebrafish model that can be used to delineate the cellular and molecular mechanisms that regulate renal regeneration. One essential aspect of such research is the evaluation of nephron structure and function. This protocol describes a set of labeling techniques that can be used to gauge renal composition and test nephron functionality in the adult zebrafish kidney. Thus, these methods are widely applicable to the future phenotypic characterization of adult zebrafish kidney injury paradigms, which include, but are not limited to, nephrotoxicant exposure regimes or genetic methods of targeted cell death such as the nitroreductase-mediated cell ablation technique. Further, these methods could be used to study genetic perturbations in adult kidney formation and could also be applied to assess renal status during chronic disease modeling.
Cellular Biology, Issue 90, zebrafish, kidney, nephron, nephrology, renal, regeneration, proximal tubule, distal tubule, segment, mesonephros, physiology, acute kidney injury (AKI)
An Affordable HIV-1 Drug Resistance Monitoring Method for Resource Limited Settings
Institutions: University of KwaZulu-Natal, Durban, South Africa, Jembi Health Systems, University of Amsterdam, Stanford Medical School.
HIV-1 drug resistance has the potential to seriously compromise the effectiveness and impact of antiretroviral therapy (ART). As ART programs in sub-Saharan Africa continue to expand, individuals on ART should be closely monitored for the emergence of drug resistance. Surveillance of transmitted drug resistance to track transmission of viral strains already resistant to ART is also critical. Unfortunately, drug resistance testing is still not readily accessible in resource limited settings, because genotyping is expensive and requires sophisticated laboratory and data management infrastructure. An open access genotypic drug resistance monitoring method to manage individuals and assess transmitted drug resistance is described. The method uses free open source software for the interpretation of drug resistance patterns and the generation of individual patient reports. The genotyping protocol has an amplification rate of greater than 95% for plasma samples with a viral load >1,000 HIV-1 RNA copies/ml. The sensitivity decreases significantly for viral loads <1,000 HIV-1 RNA copies/ml. The method described here was validated against a method of HIV-1 drug resistance testing approved by the United States Food and Drug Administration (FDA), the Viroseq genotyping method. Limitations of the method described here include the fact that it is not automated and that it also failed to amplify the circulating recombinant form CRF02_AG from a validation panel of samples, although it amplified subtypes A and B from the same panel.
Medicine, Issue 85, Biomedical Technology, HIV-1, HIV Infections, Viremia, Nucleic Acids, genetics, antiretroviral therapy, drug resistance, genotyping, affordable
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values [1]. 'How to use PSA' remains a current issue, either for diagnosis, where a gray zone corresponding to a serum concentration of 2.5-10 ng/ml does not allow a clear differentiation between cancer and noncancer [2], or for patient follow-up, where analysis of post-operative PSA kinetic parameters can pose considerable challenges for practical application [3, 4]. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g. PCA3 in prostate cancer [5, 6], and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the number of transcriptional regulatory motifs is at least 4.5x higher than that corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as this is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes, leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer [7-10]. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, and whether they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g. drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism [1, 2]. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence [3].
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recently published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin" [4], in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provided potential new leads for therapeutic targets in those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
The ITS2 Database
Institutions: University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. Because ITS2 research focused mainly on the highly variable ITS2 sequence, the marker was confined to low-level phylogenetics. However, combining the ITS2 sequence with its highly conserved secondary structure improves the phylogenetic resolution [1] and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation [2-8].
The ITS2 Database [9] presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank [11]. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold [12] (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling [13]. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence whose secondary structure could not be folded correctly by direct folding.
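The direct-fold-then-homology decision described above can be illustrated with a small sketch. It assumes the structure is given in dot-bracket notation and that "four helix conformation" can be approximated by counting top-level paired regions; this is a simplification for illustration, not the ITS2 Database's actual criterion, and the function names and example structures are hypothetical.

```python
def count_top_level_helices(dot_bracket):
    """Count paired regions that close at nesting depth zero in a dot-bracket string."""
    depth = 0
    helices = 0
    for symbol in dot_bracket:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            if depth == 0:
                raise ValueError("unbalanced structure: unmatched ')'")
            depth -= 1
            if depth == 0:
                # A complete top-level stem has just been closed.
                helices += 1
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: unmatched '('")
    return helices


def select_structure(direct_fold, homology_model=None, expected_helices=4):
    """Prefer the direct fold; fall back to a homology model when it is not four-helix."""
    if count_top_level_helices(direct_fold) == expected_helices:
        return "direct fold", direct_fold
    if homology_model is not None:
        return "homology modeling", homology_model
    return "unresolved", None


# Hypothetical toy structures, far shorter than a real ITS2 sequence.
four_helix = "((..)).(((...))).((...))..((..))"
three_helix = "((..)).(((...)))..((..))"

print(select_structure(four_helix))
print(select_structure(three_helix, homology_model=four_helix))
```

In practice the homology model would come from transferring a known structure onto the new sequence, which this sketch takes as a given input.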
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST [14] search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE [15, 16] for multiple sequence-structure alignment calculation and Neighbor Joining [18] tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever-expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although neither is effortless. As a corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in limited tissues, plant species or stress treatments, which is currently an important challenge in understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA
Institutions: Morgridge Institute for Research, University of Wisconsin, University of California.
Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity and a large dynamic range, and allows measurement of transcript copy numbers in a sample. Illumina’s genome analyzer performs sequencing of a large number (> 10^7) of relatively short sequence reads (< 150 bp). The "paired end" approach, wherein a single long read is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo
One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples has been described [1, 2] but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published [3, 4] but require microgram amounts of starting total RNA.
Here we describe a library preparation protocol for mRNA-Seq on the Illumina Genome Analyzer II platform that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing [5], where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome [6] and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples [7].
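The difference between the two units mentioned above can be made concrete with a short calculation. The sketch uses the standard definitions of RPKM (reads per kilobase of transcript per million mapped reads) and TPM; the gene names, read counts and transcript lengths are hypothetical and are not data from this protocol.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (total_reads * lengths_bp[gene])
        for gene in counts
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to one million."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total_rate = sum(rates.values())
    return {gene: rates[gene] * 1e6 / total_rate for gene in counts}


# Hypothetical read counts and transcript lengths (in base pairs).
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)

for gene in counts:
    print(f"{gene}\tRPKM={rpkm_values[gene]:.0f}\tTPM={tpm_values[gene]:.0f}")
print("TPM total:", round(sum(tpm_values.values())))
```

Because TPM values always sum to one million within a sample, a given TPM represents the same share of the transcript pool in every sample, whereas the sum of RPKM values varies from sample to sample.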
Molecular Biology, Issue 56, Genetics, mRNA-Seq, Illumina-Seq, gene expression profiling, high throughput sequencing
Genome-wide Screen for miRNA Targets Using the MISSION Target ID Library
The Target ID Library is designed to assist in the discovery and identification of microRNA (miRNA) targets. The Target ID Library is a plasmid-based, genome-wide cDNA library cloned into the 3'UTR downstream from the dual-selection fusion protein, thymidine kinase-zeocin (TKzeo). The first round of selection is for stable transformants, followed by introduction of a miRNA of interest and, finally, selection for cDNAs containing the miRNA's target. Selected cDNAs are identified by sequencing (see Figure 1-3 for Target ID Library Workflow and details).
To ensure broad coverage of the human transcriptome, Target ID Library cDNAs were generated via oligo-dT priming using a pool of total RNA prepared from multiple human tissues and cell lines. The resulting cDNAs range from 0.5 to 4 kb, with an average size of 1.2 kb, and were cloned into the p3΄TKzeo dual-selection plasmid (see Figure 4 for plasmid map). The gene targets represented in the library can be found on the Sigma-Aldrich webpage. Results from Illumina sequencing (Table 3) show that the library includes 16,922 of the 21,518 unique genes in UCSC RefGene (79%), or 14,000 genes with 10 or more reads (66%).
Genetics, Issue 62, Target ID, miRNA, ncRNA, RNAi, genomics
An Analytical Tool-box for Comprehensive Biochemical, Structural and Transcriptome Evaluation of Oral Biofilms Mediated by Mutans Streptococci
Institutions: University of Rochester Medical Center, Sichuan University, Glostrup Hospital, Glostrup, Denmark.
Biofilms are highly dynamic, organized and structured communities of microbial cells enmeshed in an extracellular matrix of variable density and composition [1, 2]. In general, biofilms develop from initial microbial attachment on a surface followed by formation of cell clusters (or microcolonies) and further development and stabilization of the microcolonies, which occur in a complex extracellular matrix. The majority of biofilm matrices harbor exopolysaccharides (EPS), and dental biofilms are no exception, especially those associated with caries disease, which are mostly mediated by mutans streptococci [3]. The EPS are synthesized by microorganisms (S. mutans, a key contributor) by means of extracellular enzymes, such as glucosyltransferases using sucrose primarily as substrate [3].
Studies of biofilms formed on tooth surfaces are particularly challenging owing to their constant exposure to environmental challenges associated with complex diet-host-microbial interactions occurring in the oral cavity. Better understanding of the dynamic changes of the structural organization and composition of the matrix, physiology and transcriptome/proteome profile of biofilm-cells in response to these complex interactions would further advance the current knowledge of how oral biofilms modulate pathogenicity. Therefore, we have developed an analytical tool-box to facilitate biofilm analysis at structural, biochemical and molecular levels by combining commonly available and novel techniques with custom-made software for data analysis. Standard analytical (colorimetric assays, RT-qPCR and microarrays) and novel fluorescence techniques (for simultaneous labeling of bacteria and EPS) were integrated with specific software for data analysis to address the complex nature of oral biofilm research.
The tool-box comprises 4 distinct but interconnected steps (Figure 1): 1) Bioassays, 2) Raw Data Input, 3) Data Processing, and 4) Data Analysis. We used our in vitro biofilm model and specific experimental conditions to demonstrate the usefulness and flexibility of the tool-box. The biofilm model is simple and reproducible, and multiple replicates of a single experiment can be done simultaneously 4, 5. Moreover, it allows temporal evaluation, inclusion of various microbial species 5 and assessment of the effects of distinct experimental conditions (e.g. treatments 6; comparison of knockout mutants vs. parental strain 5; carbohydrate availability 7). Here, we describe two specific components of the tool-box, including (i) new software for microarray data mining/organization (MDV) and fluorescence imaging analysis (DUOSTAT), and (ii) in situ EPS-labeling. We also provide an experimental case showing how the tool-box can assist with biofilm analysis, data organization, integration and interpretation.
Microbiology, Issue 47, Extracellular matrix, polysaccharides, biofilm, mutans streptococci, glucosyltransferases, confocal fluorescence, microarray
PAR-CliP - A Method to Identify Transcriptome-wide the Binding Sites of RNA Binding Proteins
Institutions: Rockefeller University, Max-Delbrück-Center for Molecular Medicine, Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB), Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB), Rockefeller University.
RNA transcripts are subjected to post-transcriptional gene regulation by interacting with hundreds of RNA-binding proteins (RBPs) and microRNA-containing ribonucleoprotein complexes (miRNPs) that are often expressed in a cell-type-dependent manner. To understand how the interplay of these RNA-binding factors affects the regulation of individual transcripts, high resolution maps of in vivo protein-RNA interactions are necessary1.
A combination of genetic, biochemical and computational approaches is typically applied to identify RNA-RBP or RNA-RNP interactions. Microarray profiling of RNAs associated with immunopurified RBPs (RIP-Chip)2 defines targets at a transcriptome level, but its application is limited to the characterization of kinetically stable interactions, and only in rare cases3,4 does it allow identification of the RBP recognition element (RRE) within the long target RNA. More direct RBP target site information is obtained by combining in vivo UV crosslinking with immunoprecipitation, followed by the isolation of crosslinked RNA segments and cDNA sequencing (CLIP)10. CLIP was used to identify targets of a number of RBPs11-17. However, CLIP is limited by the low efficiency of UV 254 nm RNA-protein crosslinking, and the location of the crosslink is not readily identifiable within the sequenced crosslinked fragments, making it difficult to separate UV-crosslinked target RNA segments from background non-crosslinked RNA fragments also present in the sample.
We developed a powerful cell-based crosslinking approach to determine at high resolution and transcriptome-wide the binding sites of cellular RBPs and miRNPs that we term PAR-CliP (Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation) (see Fig. 1A for an outline of the method). The method relies on the incorporation of photoreactive ribonucleoside analogs, such as 4-thiouridine (4-SU) and 6-thioguanosine (6-SG), into nascent RNA transcripts by living cells. Irradiation of the cells by UV light of 365 nm induces efficient crosslinking of photoreactive nucleoside-labeled cellular RNAs to interacting RBPs. Immunoprecipitation of the RBP of interest is followed by isolation of the crosslinked and coimmunoprecipitated RNA. The isolated RNA is converted into a cDNA library and deep sequenced using Solexa technology. One characteristic feature of cDNA libraries prepared by PAR-CliP is that the precise position of crosslinking can be identified by mutations residing in the sequenced cDNA. When using 4-SU, crosslinked sequences show a thymidine to cytidine transition, whereas using 6-SG results in guanosine to adenosine mutations. The presence of the mutations in crosslinked sequences makes it possible to separate them from the background of sequences derived from abundant cellular RNAs.
Application of the method to a number of diverse RNA binding proteins was reported in Hafner et al.18.
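The mutation signature described above can be illustrated with a minimal Python sketch. It assumes reads have already been aligned to a reference segment of equal length with no gaps; the function names, the example sequences and the one-transition threshold are hypothetical and are not taken from the published analysis pipeline.

```python
def find_transitions(reference, read, ref_base="T", read_base="C"):
    """Return 0-based positions where the reference has ref_base and the read has read_base.

    Assumes the read is already aligned to the reference without gaps.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must be pre-aligned to the same length")
    return [
        i
        for i, (r, q) in enumerate(zip(reference.upper(), read.upper()))
        if r == ref_base and q == read_base
    ]


def has_crosslink_signature(reference, read, min_transitions=1):
    """Flag a read as a candidate crosslinked fragment (4-SU case: T-to-C)."""
    return len(find_transitions(reference, read)) >= min_transitions


# Hypothetical reference segment and three aligned reads.
reference = "ACGTTGCATTGA"
reads = ["ACGTCGCATTGA", "ACGTTGCATTGA", "ACGCTGCATCGA"]

for read in reads:
    print(read, find_transitions(reference, read), has_crosslink_signature(reference, read))
```

For 6-SG labeling, the same function could be called with ref_base="G" and read_base="A". A real analysis would also need to account for sequencing errors and known polymorphisms, which this sketch ignores.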
Cellular Biology, Issue 41, UV crosslinking, RNA binding proteins, RNA binding motif, 4-thiouridine, 6-thioguanosine
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
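To make the three motif notations concrete, the following Python sketch matches motifs written in IUPAC nucleotide codes against a sequence. It only illustrates what non-degenerate, degenerate and bipartite motifs look like; it is not an implementation of BEAM, PRISM or SPACER, which discover motifs rather than match known ones. The function names and the example sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as ASCGWT into a regular-expression string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_motif(motif, sequence):
    """Return (start, matched_text) for every occurrence, including overlapping ones."""
    # A lookahead with a capture group reports matches that overlap each other.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical promoter fragment containing one instance of each motif type.
sequence = "TTACCGGTAAAGCGTTTACCAAAAAAAAGGTCC"

print(find_motif("ACCGGT", sequence))          # non-degenerate
print(find_motif("ASCGWT", sequence))          # degenerate (S = C or G, W = A or T)
print(find_motif("ACCnnnnnnnnGGT", sequence))  # bipartite with an 8-base spacer
```

Only the forward strand is searched here; scanning the reverse complement as well would be a natural extension.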
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested them for recombination activity. Initially Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. Because the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. Recombination activity was shown with these subsets, which could therefore serve as intermediates. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic, and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase, Tre, had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1-infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
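The first step, looking for loxP-like sequences within a longer sequence, can be sketched in Python as a sliding-window comparison. This is a minimal illustration assuming simple mismatch counting against the canonical 34-bp loxP site; the abstract does not state the authors' actual selection criteria, and the function names, the mismatch threshold and the synthetic test sequence are hypothetical.

```python
# Canonical 34-bp loxP site: two 13-bp inverted repeats flanking an 8-bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def scan_for_similar_sites(sequence, target=LOXP, max_mismatches=10):
    """Return (start, mismatches, window) for windows close to the target, best first."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(target) + 1):
        window = sequence[start:start + len(target)]
        distance = hamming(window, target)
        if distance <= max_mismatches:
            hits.append((start, distance, window))
    return sorted(hits, key=lambda hit: hit[1])


# Synthetic example (not an HIV-1 LTR): a loxP variant with three substitutions
# embedded between arbitrary flanking bases.
variant = list(LOXP)
variant[0], variant[15], variant[33] = "G", "G", "A"
variant = "".join(variant)
test_sequence = "G" * 20 + variant + "C" * 20

for start, mismatches, window in scan_for_similar_sites(test_sequence):
    print(start, mismatches, window)
```

A more realistic search would likely scan both strands and weight the inverted repeats and the spacer differently, since they play different roles in recombination.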
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution