Researchers across highly diverse fields are applying phylogenetics to their research questions. However, many of them are new to the topic, which presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, step by step, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments, followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria, and finally describe tools for visualizing phylogenetic trees. While this is by no means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly used by phylogeneticists. We intend this article to serve as a practical training tool for researchers embarking on phylogenetic studies and as an educational resource that could be incorporated into a classroom or teaching lab.
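As an illustration of the model-selection step in this pipeline, the sketch below ranks candidate substitution models by the Akaike information criterion (AIC = 2k - 2 ln L). It is a minimal example and not the procedure of any particular program: the log-likelihood values are hypothetical, and the parameter counts cover only the substitution model (branch lengths are assumed to be shared across models on a fixed tree, so they do not change the ranking).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelFit:
    name: str
    log_likelihood: float  # maximized ln L for this model (hypothetical values below)
    n_params: int          # free substitution-model parameters


def aic(fit: ModelFit) -> float:
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * fit.n_params - 2 * fit.log_likelihood


def best_model(fits):
    """Return the candidate with the lowest AIC."""
    return min(fits, key=aic)


# Hypothetical fits of three nucleotide models to the same alignment.
fits = [
    ModelFit("JC69", -5230.4, 0),   # equal rates, equal base frequencies
    ModelFit("HKY85", -5101.7, 4),  # kappa + 3 free base frequencies
    ModelFit("GTR", -5098.9, 8),    # 5 free rates + 3 free base frequencies
]

for fit in sorted(fits, key=aic):
    print(f"{fit.name}: AIC = {aic(fit):.1f}")
print("Selected:", best_model(fits).name)
```

With these made-up numbers the extra parameters of GTR do not buy enough likelihood to beat HKY85, which is the kind of trade-off the model-testing step is meant to expose.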
22 Related JoVE Articles!
Extracting DNA from the Gut Microbes of the Termite (Zootermopsis angusticollis) and Visualizing Gut Microbes
Institutions: California Institute of Technology - Caltech.
Termites are among the few animals known to have the capacity to subsist solely by consuming wood. The termite gut tract contains a dense and species-rich microbial population that assists in the degradation of lignocellulose predominantly into acetate, the key nutrient fueling termite metabolism (Odelson & Breznak, 1983). Within these microbial populations are bacteria, methanogenic archaea and, in some ("lower") termites, eukaryotic protozoa. Thus, termites are excellent research subjects for studying the interactions among microbial species and the numerous biochemical functions they perform to the benefit of their host. The species composition of microbial populations in termite guts as well as key genes involved in various biochemical processes has been explored using molecular techniques (Kudo et al., 1998; Schmitt-Wagner et al., 2003; Salmassi & Leadbetter, 2003). These techniques depend on the extraction and purification of high-quality nucleic acids from the termite gut environment. The extraction technique described in this video is a modified compilation of protocols developed for extraction and purification of nucleic acids from environmental samples (Mor et al., 1994; Berthelet et al., 1996; Purdy et al., 1996; Salmassi & Leadbetter, 2003; Ottesen et al., 2006), and it produces DNA from termite hindgut material suitable for use as template for polymerase chain reaction (PCR).
Microbiology, issue 4, microbial community, DNA, extraction, gut, termite
Profiling Individual Human Embryonic Stem Cells by Quantitative RT-PCR
Institutions: Johns Hopkins University School of Medicine.
Heterogeneity within stem cell populations hampers detailed understanding of stem cell biology, such as their differentiation propensity toward different lineages. A single cell transcriptome assay can be a new approach for dissecting individual variation. We have developed a single cell qRT-PCR method and confirmed that it works well for several gene expression profiles. At the single cell level, each human embryonic stem cell sorted as OCT4::EGFP-positive shows high expression of OCT4 but a different level of NANOG expression. Our single cell gene expression assay should be useful for interrogating population heterogeneities.
Molecular Biology, Issue 87, Single cell, heterogeneity, Amplification, qRT-PCR, Reverse transcriptase, human Embryonic Stem cell, FACS
Genome-wide Screen for miRNA Targets Using the MISSION Target ID Library
The Target ID Library is designed to assist in discovery and identification of microRNA (miRNA) targets. The Target ID Library is a plasmid-based, genome-wide cDNA library cloned into the 3'UTR downstream from the dual-selection fusion protein, thymidine kinase-zeocin (TKzeo). The first round of selection is for stable transformants, followed by introduction of a miRNA of interest and, finally, selection for cDNAs containing the miRNA's target. Selected cDNAs are identified by sequencing (see Figures 1-3 for Target ID Library Workflow and details).
To ensure broad coverage of the human transcriptome, Target ID Library cDNAs were generated via oligo-dT priming using a pool of total RNA prepared from multiple human tissues and cell lines. The resulting cDNAs range from 0.5 to 4 kb, with an average size of 1.2 kb, and were cloned into the p3΄TKzeo dual-selection plasmid (see Figure 4 for plasmid map). The gene targets represented in the library can be found on the Sigma-Aldrich webpage. Results from Illumina sequencing (Table 3) show that the library includes 16,922 of the 21,518 unique genes in UCSC RefGene (79%), or 14,000 genes with 10 or more reads (66%).
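The coverage figures quoted for the Target ID Library can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the function name is illustrative. Note that 14,000 out of 21,518 works out to roughly 65%, so the quoted 66% presumably reflects an unrounded gene count that is not given here.

```python
def coverage_percent(genes_detected: int, genes_total: int) -> float:
    """Percentage of reference genes represented in the library."""
    if genes_total <= 0:
        raise ValueError("genes_total must be positive")
    return 100.0 * genes_detected / genes_total


REFGENE_TOTAL = 21_518  # unique genes in UCSC RefGene, as quoted in the text

any_reads = coverage_percent(16_922, REFGENE_TOTAL)
ten_or_more_reads = coverage_percent(14_000, REFGENE_TOTAL)  # 14,000 is likely rounded

print(f"Genes detected:          {any_reads:.1f}%")
print(f"Genes with >= 10 reads:  {ten_or_more_reads:.1f}%")
```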
Genetics, Issue 62, Target ID, miRNA, ncRNA, RNAi, genomics
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values [1]. 'How to use PSA' remains a current issue, either for diagnosis, where a gray zone corresponding to a serum concentration of 2.5-10 ng/ml does not allow a clear differentiation between cancer and noncancer [2], or for patient follow-up, as the analysis of post-operative PSA kinetic parameters can pose considerable challenges for practical application [3,4]. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g. PCA3 in prostate cancer [5,6], and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5x higher than the one corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer [7-10]. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, whether they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Using Microfluidics Chips for Live Imaging and Study of Injury Responses in Drosophila Larvae
Institutions: University of Michigan.
Live imaging is an important technique for studying cell biological processes; however, it can be challenging in live animals. The translucent cuticle of the Drosophila larva makes it an attractive model organism for live imaging studies. However, an important challenge for live imaging techniques is to noninvasively immobilize and position an animal on the microscope. This protocol presents a simple and easy to use method for immobilizing and imaging Drosophila larvae on a polydimethylsiloxane (PDMS) microfluidic device, which we call the 'larva chip'. The larva chip comprises a snug-fitting PDMS microchamber that is attached to a thin glass coverslip, which, upon application of a vacuum via a syringe, immobilizes the animal and brings ventral structures such as the nerve cord, segmental nerves, and body wall muscles within close proximity to the coverslip. This allows for high-resolution imaging and, importantly, avoids the use of anesthetics and chemicals, which facilitates the study of a broad range of physiological processes. Since larvae recover easily from the immobilization, they can be readily subjected to multiple imaging sessions. This allows for longitudinal studies over time courses ranging from hours to days. This protocol describes step-by-step how to prepare the chip and how to utilize the chip for live imaging of neuronal events in 3rd instar larvae. These events include the rapid transport of organelles in axons, calcium responses to injury, and time-lapse studies of the trafficking of photo-convertible proteins over long distances and time scales. Another application of the chip is to study regenerative and degenerative responses to axonal injury, so the second part of this protocol describes a new and simple procedure for injuring axons within peripheral nerves by a segmental nerve crush.
Bioengineering, Issue 84, Drosophila melanogaster, Live Imaging, Microfluidics, axonal injury, axonal degeneration, calcium imaging, photoconversion, laser microsurgery
Measurement of Lifespan in Drosophila melanogaster
Institutions: University of Michigan.
Aging is a phenomenon that results in steady physiological deterioration in nearly all organisms in which it has been examined, leading to reduced physical performance and increased risk of disease. Individual aging is manifest at the population level as an increase in age-dependent mortality, which is often measured in the laboratory by observing lifespan in large cohorts of age-matched individuals. Experiments that seek to quantify the extent to which genetic or environmental manipulations impact lifespan in simple model organisms have been remarkably successful for understanding the aspects of aging that are conserved across taxa and for inspiring new strategies for extending lifespan and preventing age-associated disease in mammals.
The vinegar fly, Drosophila melanogaster, is an attractive model organism for studying the mechanisms of aging due to its relatively short lifespan, convenient husbandry, and facile genetics. However, demographic measures of aging, including age-specific survival and mortality, are extraordinarily susceptible to even minor variations in experimental design and environment, and the maintenance of strict laboratory practices for the duration of aging experiments is required. These considerations, together with the need to practice careful control of genetic background, are essential for generating robust measurements. Indeed, there are many notable controversies surrounding inference from longevity experiments in yeast, worms, flies and mice that have been traced to environmental or genetic artifacts [1-4]. In this protocol, we describe a set of procedures that have been optimized over many years of measuring longevity in Drosophila using laboratory vials. We also describe the use of the dLife software, which was developed by our laboratory and is available for download (https://sitemaker.umich.edu/pletcherlab/software). dLife accelerates throughput and promotes good practices by incorporating optimal experimental design, simplifying fly handling and data collection, and standardizing data analysis. We will also discuss the many potential pitfalls in the design, collection, and interpretation of lifespan data, and we provide steps to avoid these dangers.
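The demographic measures mentioned above, age-specific survival and mortality, can be derived from simple death counts. The following sketch is a generic life-table calculation, not the dLife implementation: the function name and the example cohort are hypothetical, and censored flies (escaped or accidentally killed) are ignored for simplicity.

```python
def life_table(deaths_per_interval, cohort_size):
    """Compute survivorship and interval mortality from death counts.

    Returns two lists indexed by census interval:
      survivorship[i] - fraction of the cohort alive at the start of interval i
      mortality[i]    - probability of dying during interval i, given alive at its start
    """
    alive = cohort_size
    survivorship, mortality = [], []
    for deaths in deaths_per_interval:
        if deaths < 0 or deaths > alive:
            raise ValueError("death count inconsistent with number alive")
        survivorship.append(alive / cohort_size)
        mortality.append(deaths / alive if alive else 0.0)
        alive -= deaths
    return survivorship, mortality


# Hypothetical cohort of 100 flies scored at four successive censuses.
survival, mortality = life_table([0, 10, 30, 60], cohort_size=100)
print("survivorship:", survival)
print("mortality:   ", [round(q, 3) for q in mortality])
```

Because mortality is conditioned on the number still alive, a handful of miscounted deaths late in life can shift the estimate substantially, which is one reason consistent handling matters.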
Developmental Biology, Issue 71, Cellular Biology, Molecular Biology, Anatomy, Physiology, Entomology, longevity, lifespan, aging, Drosophila melanogaster, fruit fly, Drosophila, mortality, animal model
Whole Mount in Situ Hybridization of E8.5 to E11.5 Mouse Embryos
Institutions: University of Georgia.
Whole mount in situ hybridization is a very informative approach for defining gene expression patterns in embryos. The in situ hybridization procedures are lengthy and technically demanding with multiple important steps that collectively contribute to the quality of the final result. This protocol describes in detail several key quality control steps for optimizing probe labeling and performance.
Overall, our protocol provides a detailed description of the critical steps necessary to reproducibly obtain high quality results. First, we describe the generation of digoxygenin (DIG) labeled RNA probes via in vitro transcription of DNA templates generated by PCR. We describe three critical quality control assays to determine the amount, integrity and specific activity of the DIG-labeled probes. These steps are important for generating a probe of sufficient sensitivity to detect endogenous mRNAs in a whole mouse embryo. In addition, we describe methods for the fixation and storage of E8.5-E11.5 day old mouse embryos for in situ hybridization. Then, we describe detailed methods for limited proteinase K digestion of the rehydrated embryos followed by the details of the hybridization conditions, post-hybridization washes and RNase treatment to remove non-specific probe hybridization. An AP-conjugated antibody is used to visualize the labeled probe and reveal the expression pattern of the endogenous transcript. Representative results are shown from successful experiments and typical suboptimal experiments.
Developmental Biology, Issue 56, transcriptome, in situ hybridization, mouse embryo, gene expression, transcripts, mRNA, in vitro transcription, riboprobe
Massively Parallel Reporter Assays in Cultured Mammalian Cells
Institutions: Broad Institute.
The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence of interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.
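One common way to compare sequences in a multiplexed reporter experiment is to count sequence tags in the expressed RNA and in the plasmid DNA pool, then take the ratio. The abstract does not spell out the scoring used in this protocol, so the sketch below is an assumption for illustration only: the function name, the pseudocount, and the tag names are hypothetical.

```python
import math


def tag_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Log2 ratio of each tag's share of RNA reads to its share of DNA reads.

    Normalizing by the library totals makes the score independent of
    sequencing depth; the pseudocount avoids taking the log of zero.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total <= 0 or dna_total <= 0:
        raise ValueError("both libraries need at least one read")
    activity = {}
    for tag, dna in dna_counts.items():
        rna = rna_counts.get(tag, 0)
        rna_share = (rna + pseudocount) / rna_total
        dna_share = (dna + pseudocount) / dna_total
        activity[tag] = math.log2(rna_share / dna_share)
    return activity


# Hypothetical counts for two tagged constructs present at equal plasmid abundance.
dna = {"enhancer_1": 100, "enhancer_2": 100}
rna = {"enhancer_1": 300, "enhancer_2": 100}
for tag, score in tag_activity(rna, dna).items():
    print(f"{tag}: {score:+.2f}")
```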
Genetics, Issue 90, gene regulation, transcriptional regulation, sequence-activity mapping, reporter assay, library cloning, transfection, tag sequencing, mammalian cells
An Affordable HIV-1 Drug Resistance Monitoring Method for Resource Limited Settings
Institutions: University of KwaZulu-Natal, Durban, South Africa, Jembi Health Systems, University of Amsterdam, Stanford Medical School.
HIV-1 drug resistance has the potential to seriously compromise the effectiveness and impact of antiretroviral therapy (ART). As ART programs in sub-Saharan Africa continue to expand, individuals on ART should be closely monitored for the emergence of drug resistance. Surveillance of transmitted drug resistance to track transmission of viral strains already resistant to ART is also critical. Unfortunately, drug resistance testing is still not readily accessible in resource limited settings, because genotyping is expensive and requires sophisticated laboratory and data management infrastructure. An open access genotypic drug resistance monitoring method to manage individuals and assess transmitted drug resistance is described. The method uses free open source software for the interpretation of drug resistance patterns and the generation of individual patient reports. The genotyping protocol has an amplification rate of greater than 95% for plasma samples with a viral load >1,000 HIV-1 RNA copies/ml. The sensitivity decreases significantly for viral loads <1,000 HIV-1 RNA copies/ml. The method described here was validated against a method of HIV-1 drug resistance testing approved by the United States Food and Drug Administration (FDA), the Viroseq genotyping method. Limitations of the method described here include the fact that it is not automated and that it also failed to amplify the circulating recombinant form CRF02_AG from a validation panel of samples, although it amplified subtypes A and B from the same panel.
Medicine, Issue 85, Biomedical Technology, HIV-1, HIV Infections, Viremia, Nucleic Acids, genetics, antiretroviral therapy, drug resistance, genotyping, affordable
A Hybrid DNA Extraction Method for the Qualitative and Quantitative Assessment of Bacterial Communities from Poultry Production Samples
Institutions: USDA-Agricultural Research Service, Oregon State University, University of Georgia, Northern Arizona University.
The efficacy of DNA extraction protocols can be highly dependent upon both the type of sample being investigated and the types of downstream analyses performed. Considering that the use of new bacterial community analysis techniques (e.g., microbiomics, metagenomics) is becoming more prevalent in the agricultural and environmental sciences and many environmental samples within these disciplines can be physicochemically and microbiologically unique (e.g., fecal and litter/bedding samples from the poultry production spectrum), appropriate and effective DNA extraction methods need to be carefully chosen. Therefore, a novel semi-automated hybrid DNA extraction method was developed specifically for use with environmental poultry production samples. This method is a combination of the two major types of DNA extraction: mechanical and enzymatic. A two-step intense mechanical homogenization (using bead-beating specifically formulated for environmental samples) was added to the beginning of the "gold standard" enzymatic DNA extraction method for fecal samples to enhance the removal of bacteria and DNA from the sample matrix and improve the recovery of Gram-positive bacterial community members. Once the enzymatic extraction portion of the hybrid method was initiated, the remaining purification process was automated using a robotic workstation to increase sample throughput and decrease sample processing error. In comparison to the strict mechanical and enzymatic DNA extraction methods, this novel hybrid method provided the best overall combined performance when considering quantitative (using 16S rRNA qPCR) and qualitative (using microbiomics) estimates of the total bacterial communities when processing poultry feces and litter samples.
Molecular Biology, Issue 94, DNA extraction, poultry, environmental, feces, litter, semi-automated, microbiomics, qPCR
The ITS2 Database
Institutions: University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution [1] and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation [2-8].
The ITS2 Database [9] presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank [11]. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold [12] (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling [13]. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence that did not fold correctly in a direct fold.
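The two-stage structure prediction described above (accept the direct fold if it yields four helices, otherwise fall back to homology modeling) can be sketched as a simple decision on a dot-bracket string. This is an illustrative simplification, not the database's actual validation code: it assumes the four helices radiate from an open central loop, so each appears as a separate top-level stem, and the function names are hypothetical.

```python
def count_top_level_helices(dot_bracket: str) -> int:
    """Count stems that open at nesting depth zero in a dot-bracket structure."""
    depth = 0
    helices = 0
    for symbol in dot_bracket:
        if symbol == "(":
            if depth == 0:
                helices += 1  # a new outermost stem begins here
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: extra ')'")
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: unclosed '('")
    return helices


def choose_prediction_route(direct_fold: str, expected_helices: int = 4) -> str:
    """Keep the direct fold if it shows the expected helix count, else fall back."""
    if count_top_level_helices(direct_fold) == expected_helices:
        return "direct fold"
    return "homology modeling"


# Toy structures: the first has four top-level stems, the second only two.
print(choose_prediction_route("((...)).((...)).((...)).((...))"))
print(choose_prediction_route("((...))..((...))"))
```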
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST [14] search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE [15,16] for multiple sequence-structure alignment calculation and Neighbor Joining [18] tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Efficient Agroinfiltration of Plants for High-level Transient Expression of Recombinant Proteins
Institutions: Arizona State University.
Mammalian cell culture is the major platform for commercial production of human vaccines and therapeutic proteins. However, it cannot meet the increasing worldwide demand for pharmaceuticals due to its limited scalability and high cost. Plants have been shown to be one of the most promising alternative pharmaceutical production platforms that are robust, scalable, low-cost and safe. The recent development of virus-based vectors has allowed rapid and high-level transient expression of recombinant proteins in plants. To further optimize the utility of the transient expression system, we demonstrate in this study a simple, efficient and scalable methodology to introduce target-gene containing Agrobacterium into plant tissue. Our results indicate that agroinfiltration with both syringe and vacuum methods resulted in the efficient introduction of Agrobacterium into leaves and robust production of two fluorescent proteins: GFP and DsRed. Furthermore, we demonstrate the unique advantages offered by both methods. Syringe infiltration is simple and does not need expensive equipment. It also allows the flexibility to either infiltrate the entire leaf with one target gene, or to introduce genes of multiple targets on one leaf. Thus, it can be used for laboratory scale expression of recombinant proteins as well as for comparing different proteins or vectors for yield or expression kinetics. The simplicity of syringe infiltration also suggests its utility in high school and college education for the subject of biotechnology. In contrast, vacuum infiltration is more robust and can be scaled up for commercial manufacture of pharmaceutical proteins. It also offers the advantage of being able to agroinfiltrate plant species that are not amenable to syringe infiltration, such as lettuce and Arabidopsis. Overall, the combination of syringe and vacuum agroinfiltration provides researchers and educators a simple, efficient, and robust methodology for transient protein expression. It will greatly facilitate the development of pharmaceutical proteins and promote science education.
Plant Biology, Issue 77, Genetics, Molecular Biology, Cellular Biology, Virology, Microbiology, Bioengineering, Plant Viruses, Antibodies, Monoclonal, Green Fluorescent Proteins, Plant Proteins, Recombinant Proteins, Vaccines, Synthetic, Virus-Like Particle, Gene Transfer Techniques, Gene Expression, Agroinfiltration, plant infiltration, plant-made pharmaceuticals, syringe agroinfiltration, vacuum agroinfiltration, monoclonal antibody, Agrobacterium tumefaciens, Nicotiana benthamiana, GFP, DsRed, geminiviral vectors, imaging, plant model
Mapping Bacterial Functional Networks and Pathways in Escherichia coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI) [1]. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships [1]. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships [2]. Large-scale GI maps have been reported for eukaryotic organisms like yeast [3-7], but GI information remains sparse for prokaryotes [8], which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods [9, 10].
Here, we present the key steps required to perform a quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome scale [9], using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection [11]) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity [9, 12, 13]) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system [14]. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected [9]. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process [2]. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex [2], as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred [9].
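The idea of a double mutant growing "better or worse than expected" can be made concrete with a small scoring sketch. The abstract does not state how the expectation is computed; the example below assumes the widely used multiplicative null model, in which the expected double-mutant fitness is the product of the two single-mutant fitnesses. The function names, the threshold, and the fitness values are hypothetical.

```python
def gi_score(fitness_a: float, fitness_b: float, fitness_ab: float) -> float:
    """Observed double-mutant fitness minus the multiplicative expectation."""
    expected = fitness_a * fitness_b
    return fitness_ab - expected


def classify_interaction(score: float, threshold: float = 0.1) -> str:
    """Label an interaction; the threshold is an arbitrary illustrative cutoff."""
    if score <= -threshold:
        return "aggravating"
    if score >= threshold:
        return "alleviating"
    return "no interaction"


# Fitness values are relative to wild type (1.0) and purely illustrative.
examples = {
    "compensatory pathways": (0.9, 0.8, 0.20),  # double mutant far sicker than expected
    "same complex": (0.6, 0.6, 0.60),           # second deletion adds no further defect
    "unrelated genes": (0.9, 0.8, 0.72),        # matches the multiplicative expectation
}
for label, (w_a, w_b, w_ab) in examples.items():
    score = gi_score(w_a, w_b, w_ab)
    print(f"{label}: score = {score:+.2f} -> {classify_interaction(score)}")
```

In a real screen the cutoff would come from replicate variance rather than a fixed number, since colony-size measurements are noisy.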
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on using target genes of known function to identify non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although none of these is effortless. As a corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in limited tissues, plant species or stress treatments, which is currently an important undertaking for understanding plant metabolism.
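The co-expression step, using a gene of known function as "bait" to find non-annotated genes with similar expression behavior, can be sketched with a Pearson correlation over expression profiles. This is a minimal illustration rather than the protocol's actual tooling: the gene names, expression values, and correlation cutoff are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def coexpressed_with(bait_profile, candidates, min_r=0.9):
    """Return (gene, r) pairs correlated with the bait at or above min_r, best first."""
    scored = ((gene, pearson(bait_profile, profile)) for gene, profile in candidates.items())
    hits = [(gene, r) for gene, r in scored if r >= min_r]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical expression across four tissues or treatments.
bait = [1.0, 2.0, 3.0, 4.0]           # annotated gene in the pathway of interest
candidates = {
    "unknown_gene_X": [2.0, 4.0, 6.0, 8.0],  # tracks the bait closely
    "unknown_gene_Y": [4.0, 3.0, 2.0, 1.0],  # anti-correlated
    "unknown_gene_Z": [1.0, 3.0, 2.0, 4.0],  # only loosely related
}
print(coexpressed_with(bait, candidates))
```

Candidates that pass the cutoff would then be prioritized for the metabolomics comparison described in the abstract.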
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Identification of Metabolically Active Bacteria in the Gut of the Generalist Spodoptera littoralis via DNA Stable Isotope Probing Using 13C-Glucose
Institutions: Max Planck Institute for Chemical Ecology.
Guts of most insects are inhabited by complex communities of symbiotic nonpathogenic bacteria. Within such microbial communities it is possible to identify commensal or mutualistic bacterial species. The latter have been observed to serve multiple functions for the insect, such as helping in insect reproduction1, boosting the immune response2, pheromone production3, as well as nutrition, including the synthesis of essential amino acids4.
Due to the importance of these associations, many efforts have been made to characterize the communities down to the individual members. However, most of these efforts were either based on cultivation methods or relied on the generation of 16S rRNA gene fragments which were sequenced for final identification. Unfortunately, these approaches only identified the bacterial species present in the gut and provided no information on the metabolic activity of the microorganisms.
To characterize the metabolically active bacterial species in the gut of an insect, we used stable isotope probing (SIP) in vivo with 13C-glucose as a universal substrate. This is a promising culture-free technique that allows the linkage of microbial phylogenies to their particular metabolic activity. This is possible by tracking stable-isotope-labeled atoms from substrates into microbial biomarkers, such as DNA and RNA5. The incorporation of 13C isotopes into DNA increases the density of the labeled DNA compared to the unlabeled (12C) one. In the end, the 13C-labeled DNA or RNA is separated by density-gradient ultracentrifugation from its unlabeled 12C counterpart6. Subsequent molecular analysis of the separated nucleic acid isotopomers provides the connection between metabolic activity and identity of the species.
Here, we present the protocol used to characterize the metabolically active bacteria in the gut of a generalist insect (our model system), Spodoptera littoralis. The phylogenetic analysis of the DNA was done using pyrosequencing, which allowed high resolution and precision in the identification of the insect gut bacterial community. As the main substrate, 13C-labeled glucose was used in the experiments. The substrate was fed to the insects using an artificial diet.
Microbiology, Issue 81, Insects, Sequence Analysis, Genetics, Microbial, Bacteria, Lepidoptera, Spodoptera littoralis, stable-isotope-probing (SIP), pyro-sequencing, 13C-glucose, gut, microbiota, bacteria
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSC self-renewal and differentiation is important for the application of HSCs in research and clinical uses. However, it is not possible to obtain large quantities of HSCs due to their inability to proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
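The comparison between Lin-CD34+ and Lin-CD34- cells described above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: it assumes one raw read-count table per population, normalizes each to counts per million, and ranks genes by log2 fold change. The gene labels (TF_A, TF_B, TF_C), the counts, the pseudocount and the threshold are all hypothetical, and a real analysis would use replicates and a dedicated statistical package.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """log2(CPM in A / CPM in B) for genes present in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    return {
        gene: math.log2((cpm_a[gene] + pseudocount) / (cpm_b[gene] + pseudocount))
        for gene in counts_a
        if gene in counts_b
    }


def rank_candidates(counts_a, counts_b, min_abs_lfc=1.0):
    """Return (gene, log2FC) pairs passing the threshold, largest change first."""
    lfc = log2_fold_changes(counts_a, counts_b)
    hits = [(g, v) for g, v in lfc.items() if abs(v) >= min_abs_lfc]
    return sorted(hits, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical counts for three placeholder transcription factors.
lin_cd34_pos = {"TF_A": 300, "TF_B": 100, "TF_C": 600}
lin_cd34_neg = {"TF_A": 100, "TF_B": 100, "TF_C": 800}

for gene, value in rank_candidates(lin_cd34_pos, lin_cd34_neg):
    print(gene, round(value, 2))
```

A fold-change ranking like this only suggests candidates; it does not measure statistical significance.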
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g. drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3.
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recently published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4 in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets for those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Use of Arabidopsis eceriferum Mutants to Explore Plant Cuticle Biosynthesis
Institutions: University of British Columbia - UBC, University of British Columbia - UBC.
The plant cuticle is a waxy outer covering on plants that has a primary role in water conservation, but is also an important barrier against the entry of pathogenic microorganisms. The cuticle is made up of a tough crosslinked polymer called "cutin" and a protective wax layer that seals the plant surface. The waxy layer of the cuticle is obvious on many plants, appearing as a shiny film on the ivy leaf or as a dusty outer covering on the surface of a grape or a cabbage leaf thanks to light scattering crystals present in the wax. Because the cuticle is an essential adaptation of plants to a terrestrial environment, understanding the genes involved in plant cuticle formation has applications in both agriculture and forestry. Today, we'll show the analysis of plant cuticle mutants identified by forward and reverse genetics approaches.
Plant Biology, Issue 16, Annual Review, Cuticle, Arabidopsis, Eceriferum Mutants, Cryo-SEM, Gas Chromatography
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
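The three motif classes named above can be illustrated with a short Python sketch. It does not reproduce BEAM, PRISM or SPACER; it only assumes that the example motifs use standard IUPAC nucleotide codes (S = C or G, W = A or T, n = any base) and shows how one string of each kind matches a DNA sequence. The test sequence and function names are made up for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif string into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets finditer report matches that overlap each other.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence containing one instance of each motif type.
sequence = "TTACCGGTAAAGCGTTTACCAAAAAAAAGGTCC"

print(find_motif("ACCGGT", sequence))          # non-degenerate motif
print(find_motif("ASCGWT", sequence))          # degenerate motif
print(find_motif("ACCnnnnnnnnGGT", sequence))  # bipartite motif with an 8-base spacer
```

This sketch only matches a known motif on the forward strand. Discovering motifs by over-representation, as SCOPE does, is a separate and harder problem.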
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a user-friendly interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
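The relationship between a list of motif instances, a per-position matrix and a consensus string can be sketched in Python as follows. This is a generic illustration, not SCOPE's own computation: it builds a simple position frequency matrix (base proportions per column, without the background correction a true weight matrix would apply) from equal-length instances. The instances and function names are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(instances):
    """Return one {base: proportion} dict per motif position."""
    length = len(instances[0])
    if any(len(s) != length for s in instances):
        raise ValueError("all motif instances must have the same length")
    matrix = []
    for i in range(length):
        column = Counter(s[i].upper() for s in instances)
        matrix.append({b: column.get(b, 0) / len(instances) for b in BASES})
    return matrix


def consensus(matrix):
    """Pick the most frequent base per position (ties resolved in A, C, G, T order)."""
    return "".join(max(BASES, key=lambda b: column[b]) for column in matrix)


# Hypothetical instances of one motif found in four promoter sequences.
instances = ["ACCGGT", "ACCGGT", "AGCGAT", "ACCGTT"]
pfm = position_frequency_matrix(instances)

print(consensus(pfm))
for position, column in enumerate(pfm, start=1):
    print(position, column)
```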
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially, Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. Recombination activity was shown with the subsets, which served as intermediates. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic, and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase, Tre, had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution
Choice and No-Choice Assays for Testing the Resistance of A. thaliana to Chewing Insects
Institutions: Cornell University.
Larvae of the small white cabbage butterfly are a pest in agricultural settings. This caterpillar species feeds on plants in the cabbage family, which includes many crops such as cabbage, broccoli and Brussels sprouts. Rearing of the insects takes place on cabbage plants in the greenhouse. At least two cages are needed for the rearing of Pieris rapae: one for the larvae and the other to contain the adults, the butterflies. In order to investigate the role of plant hormones and toxic plant chemicals in resistance to this insect pest, we demonstrate two experiments. The first determines the role of jasmonic acid (JA, a plant hormone often implicated in resistance to insects) in resistance to the chewing insect Pieris rapae. Caterpillar growth can be compared on wild-type and mutant plants impaired in production of JA. This experiment is considered "No Choice", because larvae are forced to subsist on a single plant which either synthesizes or is deficient in JA. Second, we demonstrate an experiment that investigates the role of glucosinolates, which are used as oviposition (egg-laying) signals. Here, we use wild-type (WT) and mutant Arabidopsis impaired in glucosinolate production in a "Choice" experiment in which female butterflies are allowed to choose to lay their eggs on plants of either genotype. This video demonstrates the experimental setup for both assays as well as representative results.
Plant Biology, Issue 15, Annual Review, Plant Resistance, Herbivory, Arabidopsis thaliana, Pieris rapae, Caterpillars, Butterflies, Jasmonic Acid, Glucosinolates