Researchers across remarkably diverse fields are applying phylogenetics to their research questions. However, many of them are new to the topic, which presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user guide for similarity search tools, via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments, followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria, and finally describe tools for visualizing phylogenetic trees. While this is by no means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly used by phylogeneticists. Our vision is that this article can serve as a practical training tool for researchers embarking on phylogenetic studies and as an educational resource that could be incorporated into a classroom or teaching lab.
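To make the "best-fit model of evolution" step more concrete, the following is a minimal sketch of one common way such a comparison can be made, the Akaike information criterion (AIC = 2k - 2 ln L). The abstract does not say which criterion or software the authors use, so the criterion, the function names and the log-likelihood values below are all assumptions chosen purely for illustration.

```python
# Minimal sketch: ranking substitution models by AIC.
# Assumption: the log-likelihoods below are made-up numbers, not results
# from any real alignment. Branch-length parameters are omitted because
# they are the same for every model when the tree is held fixed.

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits: dict) -> str:
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to a (log_likelihood, n_free_params) tuple.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits: (ln L, number of free substitution-model parameters).
fits = {
    "JC69": (-2450.0, 0),   # equal rates, equal base frequencies
    "HKY85": (-2400.0, 4),  # ts/tv ratio + 3 free base frequencies
    "GTR": (-2398.5, 8),    # 5 free rates + 3 free base frequencies
}

for name, (lnl, k) in fits.items():
    print(f"{name}: AIC = {aic(lnl, k):.1f}")
print("Best-fit model:", best_model(fits))
```

In this made-up case the richest model (GTR) has the highest likelihood, but its extra parameters are not justified by the small gain, so the intermediate model is preferred.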
20 Related JoVE Articles!
The ITS2 Database
Institutions: University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1 and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8.
The ITS2 Database9 presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum-energy-based fold12 (direct fold) results in a correct, four-helix conformation. If this is not the case, the structure is predicted by homology modeling13. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence whose secondary structure could not be folded correctly in a direct fold.
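The two-stage decision described above (accept the direct fold if it has four helices, otherwise fall back to homology modeling) can be sketched as follows. This is only an illustration under stated assumptions: the structure is assumed to be given in dot-bracket notation, a "helix" is simplified to a stem that branches directly off the outermost loop, and the function names are hypothetical. The database's actual validity test is not described in the abstract and may well differ.

```python
# Minimal sketch, assuming dot-bracket notation and a simplified
# definition of "helix" (a stem opening at nesting depth zero).
# Function names are hypothetical, not part of the ITS2 Database.

def count_helices(dot_bracket: str) -> int:
    """Count stems that branch directly off the outermost loop."""
    depth = 0
    helices = 0
    for ch in dot_bracket:
        if ch == "(":
            if depth == 0:
                helices += 1  # a new top-level stem starts here
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure")
    if depth != 0:
        raise ValueError("unbalanced structure")
    return helices


def choose_prediction(direct_fold: str) -> str:
    """Keep the direct fold only if it shows four helices."""
    if count_helices(direct_fold) == 4:
        return "direct fold"
    return "homology modeling"


print(choose_prediction("((..))..((..))..(((...)))..((..))"))
print(choose_prediction("((..))..((..))"))
```

Branching inside a top-level stem is deliberately ignored here, which keeps the check short at the cost of accepting some structures a stricter test would reject.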
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14 search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16 for multiple sequence-structure alignment calculation and Neighbor Joining18 tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
A Rapid and Efficient Method for Assessing Pathogenicity of Ustilago maydis on Maize and Teosinte Lines
Institutions: University of Georgia.
Maize is a major cereal crop worldwide. However, susceptibility to biotrophic pathogens is the primary constraint to increasing productivity. U. maydis is a biotrophic fungal pathogen and the causal agent of corn smut on maize. This disease is responsible for significant yield losses of approximately $1.0 billion annually in the U.S.1 Several methods including crop rotation, fungicide application and seed treatments are currently used to control corn smut2. However, host resistance is the only practical method for managing corn smut. Identification of crop plants including maize, wheat, and rice that are resistant to various biotrophic pathogens has significantly decreased yield losses annually3-5. Therefore, the use of a pathogen inoculation method that efficiently and reproducibly delivers the pathogen in between the plant leaves would facilitate the rapid identification of maize lines that are resistant to U. maydis. As a first step toward identifying maize lines that are resistant to U. maydis, a needle injection inoculation method and a resistance reaction screening method were used to inoculate maize, teosinte, and maize x teosinte introgression lines with a U. maydis strain and to select resistant plants.
Maize, teosinte and maize x teosinte introgression lines, consisting of about 700 plants, were planted, inoculated with a strain of U. maydis, and screened for resistance. The inoculation and screening methods successfully identified three teosinte lines resistant to U. maydis. Here a detailed needle injection inoculation and resistance reaction screening protocol for maize, teosinte, and maize x teosinte introgression lines is presented. This study demonstrates that needle injection inoculation is an invaluable tool in agriculture that can efficiently deliver U. maydis in between the plant leaves and has provided plant lines that are resistant to U. maydis that can now be combined and tested in breeding programs for improved disease resistance.
Environmental Sciences, Issue 83, Bacterial Infections, Signs and Symptoms, Eukaryota, Plant Physiological Phenomena, Ustilago maydis, needle injection inoculation, disease rating scale, plant-pathogen interactions
A Toolkit to Enable Hydrocarbon Conversion in Aqueous Environments
Institutions: Delft University of Technology.
This work puts forward a toolkit that enables the conversion of alkanes by Escherichia coli and presents a proof of principle of its applicability. The toolkit consists of multiple standard interchangeable parts (BioBricks)9 addressing the conversion of alkanes, regulation of gene expression and survival in toxic hydrocarbon-rich environments.
A three-step pathway for alkane degradation was implemented in E. coli to enable the conversion of medium- and long-chain alkanes to their respective alkanols, alkanals and ultimately alkanoic acids. The latter were metabolized via the native β-oxidation pathway. To facilitate the oxidation of medium-chain alkanes (C5-C13) and cycloalkanes (C5-C8), four genes (alkB2) of the alkane hydroxylase system from Gordonia were transformed into E. coli. For the conversion of long-chain alkanes (C15-C36), the ladA gene from Geobacillus thermodenitrificans was implemented. For the required further steps of the degradation process, ADH and ALDH (originating from G. thermodenitrificans) were introduced10,11. The activity was measured by resting cell assays. For each oxidative step, enzyme activity was observed.
To optimize the process efficiency, the expression was only induced under low glucose conditions: a substrate-regulated promoter, pCaiF, was used. pCaiF is present in E. coli K12 and regulates the expression of the genes involved in the degradation of non-glucose carbon sources.
The last part of the toolkit - targeting survival - was implemented using solvent tolerance genes, PhPFDα and β, both from Pyrococcus horikoshii OT3. Organic solvents can induce cell stress and decrease survivability by negatively affecting protein folding. As chaperones, PhPFDα and β improve the protein folding process, e.g., in the presence of alkanes. Expression of these genes improved hydrocarbon tolerance, as shown by an increased growth rate (up to 50%) in the presence of 10% n-hexane in the culture medium.
Summarizing, the results indicate that the toolkit enables E. coli to convert and tolerate hydrocarbons in aqueous environments. As such, it represents an initial step towards a sustainable solution for oil-remediation using a synthetic biology approach.
Bioengineering, Issue 68, Microbiology, Biochemistry, Chemistry, Chemical Engineering, Oil remediation, alkane metabolism, alkane hydroxylase system, resting cell assay, prefoldin, Escherichia coli, synthetic biology, homologous interaction mapping, mathematical model, BioBrick, iGEM
Isolation of Native Soil Microorganisms with Potential for Breaking Down Biodegradable Plastic Mulch Films Used in Agriculture
Institutions: Western Washington University, Washington State University Northwestern Research and Extension Center, Texas Tech University.
Fungi native to agricultural soils that colonized commercially available biodegradable mulch (BDM) films were isolated and assessed for potential to degrade plastics. Typically, when formulations of plastics are known and a source of the feedstock is available, powdered plastic can be suspended in agar-based media and degradation determined by visualization of clearing zones. However, this approach poorly mimics in situ degradation of BDMs. First, BDMs are not dispersed as small particles throughout the soil matrix. Secondly, BDMs are not sold commercially as pure polymers, but rather as films containing additives (e.g., fillers, plasticizers and dyes) that may affect microbial growth. The procedures described herein were used for isolates acquired from soil-buried mulch films. Fungal isolates acquired from excavated BDMs were tested individually for growth on pieces of new, disinfested BDMs laid atop defined medium containing no carbon source except agar. Isolates that grew on BDMs were further tested in liquid medium where BDMs were the sole added carbon source. After approximately ten weeks, fungal colonization and BDM degradation were assessed by scanning electron microscopy. Isolates were identified via analysis of ribosomal RNA gene sequences. This report describes methods for fungal isolation, but bacteria also were isolated using these methods by substituting media appropriate for bacteria. Our methodology should prove useful for studies investigating breakdown of intact plastic films or products for which plastic feedstocks are either unknown or not available. However, our approach does not provide a quantitative method for comparing rates of BDM degradation.
Microbiology, Issue 75, Plant Biology, Environmental Sciences, Agricultural Sciences, Soil Science, Molecular Biology, Cellular Biology, Genetics, Mycology, Fungi, Bacteria, Microorganisms, Biodegradable plastic, biodegradable mulch, compostable plastic, compostable mulch, plastic degradation, composting, breakdown, soil, 18S ribosomal DNA, isolation, culture
Heterogeneity Mapping of Protein Expression in Tumors using Quantitative Immunofluorescence
Institutions: University of Edinburgh, HistoRx Inc..
Morphologic heterogeneity within an individual tumor is well-recognized by histopathologists in surgical practice. While this often takes the form of areas of distinct differentiation into recognized histological subtypes, or different pathological grade, often there are more subtle differences in phenotype which defy accurate classification (Figure 1). Ultimately, since morphology is dictated by the underlying molecular phenotype, areas with visible differences are likely to be accompanied by differences in the expression of proteins which orchestrate cellular function and behavior, and therefore, appearance. The significance of visible and invisible (molecular) heterogeneity for prognosis is unknown, but recent evidence suggests that, at least at the genetic level, heterogeneity exists in the primary tumor1,2, and some of these sub-clones give rise to metastatic (and therefore lethal) disease.
Moreover, some proteins are measured as biomarkers because they are the targets of therapy (for instance ER and HER2 for tamoxifen and trastuzumab (Herceptin), respectively). If these proteins show variable expression within a tumor then therapeutic responses may also be variable. The widely used histopathologic scoring schemes for immunohistochemistry either ignore or numerically homogenize the quantification of protein expression. Similarly, in destructive techniques, where the tumor samples are homogenized (such as gene expression profiling), quantitative information can be elucidated, but spatial information is lost. Genetic heterogeneity mapping approaches in pancreatic cancer have relied either on generation of a single cell suspension3, or on macrodissection4. A recent study has used quantum dots in order to map morphologic and molecular heterogeneity in prostate cancer tissue5, providing proof of principle that morphologic and molecular mapping is feasible, but falling short of quantifying the heterogeneity. Since immunohistochemistry is, at best, only semi-quantitative and subject to intra- and inter-observer bias, more sensitive and quantitative methodologies are required in order to accurately map and quantify tissue heterogeneity in situ.
We have developed and applied an experimental and statistical methodology in order to systematically quantify the heterogeneity of protein expression in whole tissue sections of tumors, based on the Automated QUantitative Analysis (AQUA) system6. Tissue sections are labeled with specific antibodies directed against cytokeratins and targets of interest, coupled to fluorophore-labeled secondary antibodies. Slides are imaged using a whole-slide fluorescence scanner. Images are subdivided into hundreds to thousands of tiles, and each tile is then assigned an AQUA score which is a measure of protein concentration within the epithelial (tumor) component of the tissue. Heatmaps are generated to represent tissue expression of the proteins and a heterogeneity score assigned, using a statistical measure of heterogeneity originally used in ecology, based on the Simpson's biodiversity index7.
To date there have been no attempts to systematically map and quantify this variability in tandem with protein expression, in histological preparations. Here, we illustrate the first use of the method applied to ER and HER2 biomarker expression in ovarian cancer. Using this method paves the way for analyzing heterogeneity as an independent variable in studies of biomarker expression in translational studies, in order to establish the significance of heterogeneity in prognosis and prediction of responses to therapy.
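As a rough illustration of how a Simpson-type index could turn per-tile scores into a single heterogeneity value, consider the sketch below. The abstract does not give the exact formula or binning scheme used, so everything here is an assumption: tile scores are grouped into fixed-width bins (treated like "species"), Simpson's index is taken as the sum of squared bin proportions, and heterogeneity is reported as one minus that index. The scores and the bin width are invented.

```python
# Minimal sketch, assuming tiles are binned by score and each bin is
# treated as a "species". Scores and bin width are hypothetical.
from collections import Counter


def bin_scores(scores, bin_width):
    """Map each tile score to an integer bin label."""
    return [int(score // bin_width) for score in scores]


def simpson_index(labels):
    """Simpson's index D = sum(p_i^2): 1.0 means fully homogeneous."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tiles to score")
    return sum((count / total) ** 2 for count in counts.values())


# Hypothetical per-tile scores for one tissue section.
tile_scores = [120.0, 130.0, 480.0, 510.0, 905.0, 140.0, 495.0, 910.0]
bins = bin_scores(tile_scores, bin_width=250.0)

# Report 1 - D so that larger values mean more heterogeneity.
heterogeneity = 1.0 - simpson_index(bins)
print(f"Heterogeneity score: {heterogeneity:.5f}")
```

The choice of bin width strongly affects the result, so in practice it would need to be fixed in advance and applied identically to every section being compared.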
Medicine, Issue 56, quantitative immunofluorescence, heterogeneity, cancer, biomarker, targeted therapy, immunohistochemistry, proteomics, histopathology
Next-generation Sequencing of 16S Ribosomal RNA Gene Amplicons
Institutions: National Research Council Canada.
One of the major questions in microbial ecology is “who is there?” This question can be answered using various tools, but one of the long-lasting gold standards is to sequence 16S ribosomal RNA (rRNA) gene amplicons generated by domain-level PCR reactions amplifying from genomic DNA. Traditionally, this was performed by cloning and Sanger (capillary electrophoresis) sequencing of PCR amplicons. The advent of next-generation sequencing has tremendously simplified 16S rRNA gene sequencing and increased its sequencing depth. The introduction of benchtop sequencers now allows small labs to perform their 16S rRNA sequencing in-house in a matter of days. Here, an approach for 16S rRNA gene amplicon sequencing using a benchtop next-generation sequencer is detailed. The environmental DNA is first amplified by PCR using primers that contain sequencing adapters and barcodes. The amplicons are then coupled to spherical particles via emulsion PCR. The particles are loaded on a disposable chip and the chip is inserted in the sequencing machine, after which the sequencing is performed. The sequences are retrieved in fastq format and filtered, and the barcodes are used to establish the sample membership of the reads. The filtered and binned reads are then further analyzed using publicly available tools. An example analysis where the reads were classified with a taxonomy-finding algorithm within the software package Mothur is given. The method outlined here is simple, inexpensive and straightforward and should help smaller labs to take advantage of the ongoing genomic revolution.
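The filtering and barcode-binning step described above can be sketched in a few lines. This is not the authors' pipeline: the barcode sequences, sample names, quality threshold and function names are hypothetical, exact prefix matching is assumed, and quality scores are assumed to use the common Phred+33 encoding.

```python
# Minimal sketch of quality filtering and barcode demultiplexing.
# Assumptions: 4-line fastq records, Phred+33 qualities, barcodes at
# the start of each read, exact matching. All names are hypothetical.
from collections import defaultdict
from io import StringIO


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record fastq."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string."""
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def demultiplex(handle, barcodes, min_mean_q=20.0):
    """Drop low-quality reads, then bin the rest by leading barcode."""
    bins = defaultdict(list)
    for name, seq, qual in read_fastq(handle):
        if not seq or mean_quality(qual) < min_mean_q:
            continue
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                # Store the read with its barcode trimmed off.
                bins[sample].append((name, seq[len(barcode):]))
                break
    return dict(bins)


example = StringIO(
    "@read1\nACGTTTGGCC\n+\nIIIIIIIIII\n"
    "@read2\nTGCAGGCCAA\n+\nIIIIIIIIII\n"
    "@read3\nACGTAACCGG\n+\n!!!!!!!!!!\n"  # low quality, filtered out
)
barcodes = {"ACGT": "soil_A", "TGCA": "soil_B"}
result = demultiplex(example, barcodes)
print(result)
```

Real barcodes usually tolerate one or more mismatches and are followed by a primer sequence that also has to be trimmed; both are left out here to keep the sketch short.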
Molecular Biology, Issue 90, Metagenomics, Bacteria, 16S ribosomal RNA gene, Amplicon sequencing, Next-generation sequencing, benchtop sequencers
Extracting DNA from the Gut Microbes of the Termite (Zootermopsis angusticollis) and Visualizing Gut Microbes
Institutions: California Institute of Technology - Caltech.
Termites are among the few animals known to have the capacity to subsist solely by consuming wood. The termite gut tract contains a dense and species-rich microbial population that assists in the degradation of lignocellulose predominantly into acetate, the key nutrient fueling termite metabolism (Odelson & Breznak, 1983). Within these microbial populations are bacteria, methanogenic archaea and, in some ("lower") termites, eukaryotic protozoa. Thus, termites are excellent research subjects for studying the interactions among microbial species and the numerous biochemical functions they perform to the benefit of their host. The species composition of microbial populations in termite guts as well as key genes involved in various biochemical processes have been explored using molecular techniques (Kudo et al., 1998; Schmitt-Wagner et al., 2003; Salmassi & Leadbetter, 2003). These techniques depend on the extraction and purification of high-quality nucleic acids from the termite gut environment. The extraction technique described in this video is a modified compilation of protocols developed for extraction and purification of nucleic acids from environmental samples (Mor et al., 1994; Berthelet et al., 1996; Purdy et al., 1996; Salmassi & Leadbetter, 2003; Ottesen et al., 2006) and it produces DNA from termite hindgut material suitable for use as template for polymerase chain reaction (PCR).
Microbiology, Issue 4, microbial community, DNA, extraction, gut, termite
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values1. 'How to use PSA' remains a current issue, either for diagnosis, as a gray zone corresponding to a concentration in serum of 2.5-10 ng/ml does not allow a clear differentiation to be made between cancer and noncancer2, or for patient follow-up, as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application3,4. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g., PCA3 in prostate cancer5,6, and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the number of transcriptional regulatory motifs is at least 4.5x higher than that corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer7-10. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which estimates how accurate the predictions are without knowledge of the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best program for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Using Coculture to Detect Chemically Mediated Interspecies Interactions
Institutions: University of North Carolina at Chapel Hill.
In nature, bacteria rarely exist in isolation; they are instead surrounded by a diverse array of other microorganisms that alter the local environment by secreting metabolites. These metabolites have the potential to modulate the physiology and differentiation of their microbial neighbors and are likely important factors in the establishment and maintenance of complex microbial communities. We have developed a fluorescence-based coculture screen to identify such chemically mediated microbial interactions. The screen involves combining a fluorescent transcriptional reporter strain with environmental microbes on solid media and allowing the colonies to grow in coculture. The fluorescent transcriptional reporter is designed so that the chosen bacterial strain fluoresces when it is expressing a particular phenotype of interest (e.g., biofilm formation, sporulation, virulence factor production, etc.). Screening is performed under growth conditions where this phenotype is not expressed (and therefore the reporter strain is typically nonfluorescent). When an environmental microbe secretes a metabolite that activates this phenotype, it diffuses through the agar and activates the fluorescent reporter construct. This allows the inducing-metabolite-producing microbes to be detected: they are the nonfluorescent colonies most proximal to the fluorescent colonies. Thus, this screen allows the identification of environmental microbes that produce diffusible metabolites that activate a particular physiological response in a reporter strain. This publication discusses how to: a) select appropriate coculture screening conditions, b) prepare the reporter and environmental microbes for screening, c) perform the coculture screen, d) isolate putative inducing organisms, and e) confirm their activity in a secondary screen. We developed this method to screen for soil organisms that activate biofilm matrix-production in Bacillus subtilis; however, we also discuss considerations for applying this approach to other genetically tractable bacteria.
Microbiology, Issue 80, High-Throughput Screening Assays, Genes, Reporter, Microbial Interactions, Soil Microbiology, Coculture, microbial interactions, screen, fluorescent transcriptional reporters, Bacillus subtilis
Identification of Metabolically Active Bacteria in the Gut of the Generalist Spodoptera littoralis via DNA Stable Isotope Probing Using 13C-Glucose
Institutions: Max Planck Institute for Chemical Ecology.
Guts of most insects are inhabited by complex communities of symbiotic nonpathogenic bacteria. Within such microbial communities it is possible to identify commensal or mutualistic bacterial species. The latter have been observed to serve multiple functions for the insect, i.e., helping in insect reproduction1, boosting the immune response2, pheromone production3, as well as nutrition, including the synthesis of essential amino acids4.
Due to the importance of these associations, many efforts have been made to characterize the communities down to the individual members. However, most of these efforts were either based on cultivation methods or relied on the generation of 16S rRNA gene fragments which were sequenced for final identification. Unfortunately, these approaches only identified the bacterial species present in the gut and provided no information on the metabolic activity of the microorganisms.
To characterize the metabolically active bacterial species in the gut of an insect, we used stable isotope probing (SIP) in vivo with 13C-glucose as a universal substrate. This is a promising culture-free technique that allows the linkage of microbial phylogenies to their particular metabolic activity. This is possible by tracking stable isotope-labeled atoms from substrates into microbial biomarkers, such as DNA and RNA5. The incorporation of 13C isotopes into DNA increases the density of the labeled DNA compared to the unlabeled (12C) one. In the end, the 13C-labeled DNA or RNA is separated by density-gradient ultracentrifugation from its unlabeled (12C) counterpart6. Subsequent molecular analysis of the separated nucleic acid isotopomers provides the connection between metabolic activity and identity of the species.
Here, we present the protocol used to characterize the metabolically active bacteria in the gut of a generalist insect (our model system, Spodoptera littoralis). The phylogenetic analysis of the DNA was done using pyrosequencing, which allowed high resolution and precision in the identification of the insect gut bacterial community. As the main substrate, 13C-labeled glucose was used in the experiments. The substrate was fed to the insects using an artificial diet.
Microbiology, Issue 81, Insects, Sequence Analysis, Genetics, Microbial, Bacteria, Lepidoptera, Spodoptera littoralis, stable-isotope-probing (SIP), pyro-sequencing, 13C-glucose, gut, microbiota, bacteria
A PCR-based Genotyping Method to Distinguish Between Wild-type and Ornamental Varieties of Imperata cylindrica
Institutions: The University of Alabama, Huntsville, Center for Plant Health Science and Technology.
Wild-type I. cylindrica (cogongrass) is one of the top ten worst invasive plants in the world, negatively impacting agricultural and natural resources in 73 different countries throughout Africa, Asia, Europe, New Zealand, Oceania and the Americas1-2. Cogongrass forms rapidly-spreading, monodominant stands that displace a large variety of native plant species and in turn threaten the native animals that depend on the displaced native plant species for forage and shelter. To add to the problem, an ornamental variety [I. cylindrica (Retzius)] is widely marketed under the names of Imperata cylindrica 'Rubra', Red Baron, and Japanese blood grass (JBG). This variety is putatively sterile and noninvasive and is considered a desirable ornamental for its red-colored leaves. However, under the correct conditions, JBG can produce viable seed (Carol Holko, 2009 personal communication) and can revert to a green invasive form that is often indistinguishable from cogongrass as it takes on the distinguishing characteristics of the wild-type invasive variety4. This makes identification using morphology a difficult task even for well-trained plant taxonomists. Reversion of JBG to an aggressive green phenotype is also not a rare occurrence. Using sequence comparisons of coding and variable regions in both nuclear and chloroplast DNA, we have confirmed that JBG has reverted to the green invasive within the states of Maryland, South Carolina, and Missouri. JBG has been sold and planted in just about every state in the continental U.S. where there is not an active cogongrass infestation. The extent of the revert problem is not well understood because reverted plants are undocumented and often destroyed.
Application of this molecular protocol provides a method to identify JBG reverts and can help keep these varieties from co-occurring and possibly hybridizing. Cogongrass is an obligate outcrosser and, when crossed with a different genotype, can produce viable wind-dispersed seeds that spread cogongrass over wide distances5-7. JBG has a slightly different genotype than cogongrass and may be able to form viable hybrids with cogongrass. To add to the problem, JBG is more cold and shade tolerant than cogongrass8-10, and gene flow between these two varieties is likely to generate hybrids that are more aggressive, shade tolerant, and cold hardy than wild-type cogongrass. While wild-type cogongrass currently infests over 490 million hectares worldwide, in the Southeast U.S. it infests over 500,000 hectares and is capable of occupying most of the U.S. as it rapidly spreads northward due to its broad niche and geographic potential3,7,11. The potential of a genetic crossing is a serious concern for the USDA-APHIS Federal Noxious Weed Program. Currently, the USDA-APHIS prohibits JBG in states where there are major cogongrass infestations (e.g., Florida, Alabama, Mississippi). However, preventing the two varieties from combining can prove more difficult as cogongrass and JBG expand their distributions. Furthermore, the distribution of the JBG revert is currently unknown and without the ability to identify these varieties through morphology, some cogongrass infestations may be the result of JBG reverts. Unfortunately, current molecular methods of identification typically rely on AFLP (Amplified Fragment Length Polymorphisms) and DNA sequencing, both of which are time consuming and costly. Here, we present the first cost-effective and reliable PCR-based molecular genotyping method to accurately distinguish between cogongrass and JBG revert.
Molecular Biology, Issue 60, Molecular genotyping, Japanese blood grass, Red Baron, cogongrass, invasive plants
An Affordable HIV-1 Drug Resistance Monitoring Method for Resource Limited Settings
Institutions: University of KwaZulu-Natal, Durban, South Africa, Jembi Health Systems, University of Amsterdam, Stanford Medical School.
HIV-1 drug resistance has the potential to seriously compromise the effectiveness and impact of antiretroviral therapy (ART). As ART programs in sub-Saharan Africa continue to expand, individuals on ART should be closely monitored for the emergence of drug resistance. Surveillance of transmitted drug resistance to track transmission of viral strains already resistant to ART is also critical. Unfortunately, drug resistance testing is still not readily accessible in resource limited settings, because genotyping is expensive and requires sophisticated laboratory and data management infrastructure. An open access genotypic drug resistance monitoring method to manage individuals and assess transmitted drug resistance is described. The method uses free open source software for the interpretation of drug resistance patterns and the generation of individual patient reports. The genotyping protocol has an amplification rate of greater than 95% for plasma samples with a viral load >1,000 HIV-1 RNA copies/ml. The sensitivity decreases significantly for viral loads <1,000 HIV-1 RNA copies/ml. The method described here was validated against a method of HIV-1 drug resistance testing approved by the United States Food and Drug Administration (FDA), the Viroseq genotyping method. Limitations of the method described here include the fact that it is not automated and that it also failed to amplify the circulating recombinant form CRF02_AG from a validation panel of samples, although it amplified subtypes A and B from the same panel.
Medicine, Issue 85, Biomedical Technology, HIV-1, HIV Infections, Viremia, Nucleic Acids, genetics, antiretroviral therapy, drug resistance, genotyping, affordable
Isolation of Fidelity Variants of RNA Viruses and Characterization of Virus Mutation Frequency
Institutions: Institut Pasteur.
RNA viruses use RNA dependent RNA polymerases to replicate their genomes. The intrinsically high error rate of these enzymes is a large contributor to the generation of extreme population diversity that facilitates virus adaptation and evolution. Increasing evidence shows that the intrinsic error rates, and the resulting mutation frequencies, of RNA viruses can be modulated by subtle amino acid changes to the viral polymerase. Although biochemical assays exist for some viral RNA polymerases that permit quantitative measure of incorporation fidelity, here we describe a simple method of measuring mutation frequencies of RNA viruses that has proven to be as accurate as biochemical approaches in identifying fidelity altering mutations. The approach uses conventional virological and sequencing techniques that can be performed in most biology laboratories. Based on our experience with a number of different viruses, we have identified the key steps that must be optimized to increase the likelihood of isolating fidelity variants and generating data of statistical significance. The isolation and characterization of fidelity altering mutations can provide new insights into polymerase structure and function1-3. Furthermore, these fidelity variants can be useful tools in characterizing mechanisms of virus adaptation and evolution4-7.
Immunology, Issue 52, Polymerase fidelity, RNA virus, mutation frequency, mutagen, RNA polymerase, viral evolution
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever-expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. This protocol describes the annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics (Figure 1). The approach uses target genes of known function to identify non-annotated genes likely to be involved in a given metabolic process, with target compounds identified via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although neither is effortless. As a corollary, the approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in particular tissues, plant species or stress treatments, which is currently an important challenge in understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Quantitative and Automated High-throughput Genome-wide RNAi Screens in C. elegans
Institutions: Université de la Méditerranée.
RNA interference is a powerful method to understand gene function, especially when conducted at a whole-genome scale and in a quantitative context. In C. elegans, gene function can be knocked down simply and efficiently by feeding worms with bacteria expressing a dsRNA corresponding to a specific gene 1. While the creation of libraries of RNAi clones covering most of the C. elegans genome opened the way for true functional genomic studies (see for example 4-7), most established methods are laborious. Moy and colleagues have developed semi-automated protocols that facilitate genome-wide screens 8. The approach relies on microscopic imaging and image analysis.
Here we describe an alternative protocol for a high-throughput genome-wide screen, based on robotic handling of bacterial RNAi clones, quantitative analysis using the COPAS Biosort (Union Biometrica (UBI)), and an integrated software package, the MBioLIMS (Laboratory Information Management System from Modul-Bio), a technology that provides increased throughput for data management and sample tracking. The method allows screens to be conducted on solid medium plates. This is particularly important for some studies, such as those addressing host-pathogen interactions in C. elegans, since certain microbes do not efficiently infect worms in liquid culture.
We show how the method can be used to quantify the importance of genes in anti-fungal innate immunity in C. elegans. In this case, the approach relies on the use of a transgenic strain carrying an epidermal infection-inducible fluorescent reporter gene, with GFP under the control of the promoter of the antimicrobial peptide gene nlp-29, and a red fluorescent reporter that is expressed constitutively in the epidermis. The latter provides an internal control for the functional integrity of the epidermis and nonspecific transgene silencing9. When control worms are infected by the fungus, they fluoresce green. Knocking down by RNAi a gene required for nlp-29 expression results in diminished fluorescence after infection. Currently, this protocol allows more than 3,000 RNAi clones to be tested and analyzed per week, opening the possibility of screening the entire genome in less than 2 months.
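One way to picture how the two reporters could be combined is sketched below. This is only an illustration, not the authors' analysis pipeline: it assumes that each worm yields a green and a red fluorescence value, that the green/red ratio is used to normalize the inducible signal to the constitutive control, and that a clone is flagged when its mean ratio falls below a fraction of the control ratio. The function names, the clone names, the sample values, and the 0.5 threshold are all hypothetical.

```python
from statistics import mean

def green_red_ratio(worms):
    """Mean green/red ratio for a list of (green, red) measurements.

    Worms with no red signal are skipped, since the constitutive red
    reporter is the internal control (assumption of this sketch).
    """
    ratios = [green / red for green, red in worms if red > 0]
    return mean(ratios) if ratios else None

def flag_hits(clones, control, threshold=0.5):
    """Return names of clones whose ratio is below threshold * control ratio.

    `threshold` is a hypothetical cut-off chosen only for illustration.
    """
    control_ratio = green_red_ratio(control)
    hits = []
    for name, worms in clones.items():
        ratio = green_red_ratio(worms)
        if ratio is not None and ratio / control_ratio < threshold:
            hits.append(name)
    return hits

# Hypothetical measurements: (green, red) per infected worm.
control = [(100, 50), (120, 60)]
clones = {
    "clone_a": [(30, 50), (20, 40)],   # reduced green signal after infection
    "clone_b": [(90, 50), (110, 50)],  # similar to control
}
hits = flag_hits(clones, control)
```

In this sketch a clone with reduced green but normal red fluorescence is flagged, while a clone in which both signals drop together would keep a near-normal ratio and so would not be mistaken for a specific effect on the inducible reporter.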
Molecular Biology, Issue 60, C. elegans, fluorescent reporter, Biosort, LIMS, innate immunity, Drechmeria coniospora
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
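To make the three motif types concrete, the sketch below scans a sequence for the example motifs quoted above by expanding IUPAC nucleotide codes (S = C or G, W = A or T, N = any base) into a regular expression. It illustrates only what the motif notation means; it is not an implementation of BEAM, PRISM, or SPACER, which discover motifs by over-representation rather than matching a known pattern. The function names and the test sequence are made up for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_pattern(motif):
    """Translate an IUPAC motif (case-insensitive) into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(motif, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + motif_to_pattern(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical sequence containing one instance of each motif type.
promoter = "TTACCGGT" + "AAAGCGAT" + "ACC" + "TTTTTTTT" + "GGT" + "CC"

non_degenerate = find_motif("ACCGGT", promoter)       # exact motif
degenerate = find_motif("ASCGWT", promoter)           # motif with ambiguity codes
bipartite = find_motif("ACCnnnnnnnnGGT", promoter)    # two halves and an 8-base gap
```

Only the forward strand is searched here; a fuller scan would also check the reverse complement.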
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Principles of Site-Specific Recombinase (SSR) Technology
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Site-specific recombinase (SSR) technology allows the manipulation of gene structure to explore gene function and has become an integral tool of molecular biology. Site-specific recombinases are proteins that bind to distinct DNA target sequences. The Cre/lox system was first described in bacteriophages during the 1980s. Cre recombinase is a Type I topoisomerase that catalyzes site-specific recombination of DNA between two loxP (locus of X-over P1) sites. The Cre/lox system does not require any cofactors. LoxP sequences contain distinct binding sites for Cre recombinases that surround a directional core sequence where recombination and rearrangement take place. When cells contain loxP sites and express the Cre recombinase, a recombination event occurs. Double-stranded DNA is cut at both loxP sites by the Cre recombinase, rearranged, and ligated ("scissors and glue"). Products of the recombination event depend on the relative orientation of the asymmetric sequences.
SSR technology is frequently used as a tool to explore gene function. Here the gene of interest is flanked with Cre target sites, loxP ("floxed"). Animals are then crossed with animals expressing the Cre recombinase under the control of a tissue-specific promoter. In tissues that express the Cre recombinase, it binds to target sequences and excises the floxed gene. Controlled gene deletion allows the investigation of gene function in specific tissues and at distinct time points. Analysis of gene function employing SSR technology (conditional mutagenesis) has significant advantages over traditional knock-outs, where gene deletion is frequently lethal.
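The excision of a floxed gene can be pictured at the sequence level with the short sketch below. It is a bookkeeping model only, under several assumptions: the 34-bp loxP sequence used is the commonly cited one and is not given in the text above; both sites are assumed to lie in the same orientation on the strand shown; and the function name and the example construct are hypothetical. In the cell, Cre cuts within the core of each site, but for two sites in the same orientation the resulting sequences are the same as those produced here.

```python
# Commonly cited 34-bp loxP site: two 13-bp Cre-binding arms flanking an
# 8-bp directional core (assumed here; not taken from the text above).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

def excise_floxed(dna, site=LOXP):
    """Model Cre-mediated excision between two same-orientation loxP sites.

    Returns (remaining, excised): the DNA left behind, which retains one
    loxP site, and the excised segment, which carries the other site and
    the floxed sequence. If fewer than two sites are found, the DNA is
    returned unchanged with an empty excised segment.
    """
    first = dna.find(site)
    if first == -1:
        return dna, ""
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna, ""
    remaining = dna[:first] + dna[second:]
    excised = dna[first:second]
    return remaining, excised

# Hypothetical construct: flank, loxP, short "gene", loxP, flank.
gene = "ATGAAATAG"
construct = "GGG" + LOXP + gene + LOXP + "CCC"
remaining, excised = excise_floxed(construct)
```

Sites in opposite orientation are not handled; that arrangement gives a different product, consistent with the statement that the outcome depends on the relative orientation of the sites.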
Cellular Biology, Issue 15, Molecular Biology, Site-Specific Recombinase, Cre recombinase, Cre/lox system, transgenic animals, transgenic technology
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step, we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially, Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. These subsets served as intermediates, and recombination activity was detected with them. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic, and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase, Tre, had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution