One of the major questions in microbial ecology is “who is there?” This question can be answered using various tools, but one of the long-lasting gold standards is to sequence 16S ribosomal RNA (rRNA) gene amplicons generated by domain-level PCR reactions amplifying from genomic DNA. Traditionally, this was performed by cloning and Sanger (capillary electrophoresis) sequencing of PCR amplicons. The advent of next-generation sequencing has tremendously simplified and increased the sequencing depth for 16S rRNA gene sequencing. The introduction of benchtop sequencers now allows small labs to perform their 16S rRNA sequencing in-house in a matter of days. Here, an approach for 16S rRNA gene amplicon sequencing using a benchtop next-generation sequencer is detailed. The environmental DNA is first amplified by PCR using primers that contain sequencing adapters and barcodes. They are then coupled to spherical particles via emulsion PCR. The particles are loaded on a disposable chip and the chip is inserted in the sequencing machine after which the sequencing is performed. The sequences are retrieved in fastq format, filtered and the barcodes are used to establish the sample membership of the reads. The filtered and binned reads are then further analyzed using publically available tools. An example analysis where the reads were classified with a taxonomy-finding algorithm within the software package Mothur is given. The method outlined here is simple, inexpensive and straightforward and should help smaller labs to take advantage from the ongoing genomic revolution.
21 Related JoVE Articles!
Optimization and Utilization of Agrobacterium-mediated Transient Protein Production in Nicotiana
Institutions: Fraunhofer USA Center for Molecular Biotechnology.
-mediated transient protein production in plants is a promising approach to produce vaccine antigens and therapeutic proteins within a short period of time. However, this technology is only just beginning to be applied to large-scale production as many technological obstacles to scale up are now being overcome. Here, we demonstrate a simple and reproducible method for industrial-scale transient protein production based on vacuum infiltration of Nicotiana
plants with Agrobacteria
carrying launch vectors. Optimization of Agrobacterium
cultivation in AB medium allows direct dilution of the bacterial culture in Milli-Q water, simplifying the infiltration process. Among three tested species of Nicotiana
, N. excelsiana
× N. excelsior
) was selected as the most promising host due to the ease of infiltration, high level of reporter protein production, and about two-fold higher biomass production under controlled environmental conditions. Induction of Agrobacterium
harboring pBID4-GFP (Tobacco mosaic virus
-based) using chemicals such as acetosyringone and monosaccharide had no effect on the protein production level. Infiltrating plant under 50 to 100 mbar for 30 or 60 sec resulted in about 95% infiltration of plant leaf tissues. Infiltration with Agrobacterium
laboratory strain GV3101 showed the highest protein production compared to Agrobacteria
laboratory strains LBA4404 and C58C1 and wild-type Agrobacteria
strains at6, at10, at77 and A4. Co-expression of a viral RNA silencing suppressor, p23 or p19, in N. benthamiana
resulted in earlier accumulation and increased production (15-25%) of target protein (influenza virus hemagglutinin).
Plant Biology, Issue 86, Agroinfiltration, Nicotiana benthamiana, transient protein production, plant-based expression, viral vector, Agrobacteria
Isolation of Native Soil Microorganisms with Potential for Breaking Down Biodegradable Plastic Mulch Films Used in Agriculture
Institutions: Western Washington University, Washington State University Northwestern Research and Extension Center, Texas Tech University.
Fungi native to agricultural soils that colonized commercially available biodegradable mulch (BDM) films were isolated and assessed for potential to degrade plastics. Typically, when formulations of plastics are known and a source of the feedstock is available, powdered plastic can be suspended in agar-based media and degradation determined by visualization of clearing zones. However, this approach poorly mimics in situ
degradation of BDMs. First, BDMs are not dispersed as small particles throughout the soil matrix. Secondly, BDMs are not sold commercially as pure polymers, but rather as films containing additives (e.g.
fillers, plasticizers and dyes) that may affect microbial growth. The procedures described herein were used for isolates acquired from soil-buried mulch films. Fungal isolates acquired from excavated BDMs were tested individually for growth on pieces of new, disinfested BDMs laid atop defined medium containing no carbon source except agar. Isolates that grew on BDMs were further tested in liquid medium where BDMs were the sole added carbon source. After approximately ten weeks, fungal colonization and BDM degradation were assessed by scanning electron microscopy. Isolates were identified via analysis of ribosomal RNA gene sequences. This report describes methods for fungal isolation, but bacteria also were isolated using these methods by substituting media appropriate for bacteria. Our methodology should prove useful for studies investigating breakdown of intact plastic films or products for which plastic feedstocks are either unknown or not available. However our approach does not provide a quantitative method for comparing rates of BDM degradation.
Microbiology, Issue 75, Plant Biology, Environmental Sciences, Agricultural Sciences, Soil Science, Molecular Biology, Cellular Biology, Genetics, Mycology, Fungi, Bacteria, Microorganisms, Biodegradable plastic, biodegradable mulch, compostable plastic, compostable mulch, plastic degradation, composting, breakdown, soil, 18S ribosomal DNA, isolation, culture
Detecting Somatic Genetic Alterations in Tumor Specimens by Exon Capture and Massively Parallel Sequencing
Institutions: Memorial Sloan-Kettering Cancer Center, Memorial Sloan-Kettering Cancer Center.
Efforts to detect and investigate key oncogenic mutations have proven valuable to facilitate the appropriate treatment for cancer patients. The establishment of high-throughput, massively parallel "next-generation" sequencing has aided the discovery of many such mutations. To enhance the clinical and translational utility of this technology, platforms must be high-throughput, cost-effective, and compatible with formalin-fixed paraffin embedded (FFPE) tissue samples that may yield small amounts of degraded or damaged DNA. Here, we describe the preparation of barcoded and multiplexed DNA libraries followed by hybridization-based capture of targeted exons for the detection of cancer-associated mutations in fresh frozen and FFPE tumors by massively parallel sequencing. This method enables the identification of sequence mutations, copy number alterations, and select structural rearrangements involving all targeted genes. Targeted exon sequencing offers the benefits of high throughput, low cost, and deep sequence coverage, thus conferring high sensitivity for detecting low frequency mutations.
Molecular Biology, Issue 80, Molecular Diagnostic Techniques, High-Throughput Nucleotide Sequencing, Genetics, Neoplasms, Diagnosis, Massively parallel sequencing, targeted exon sequencing, hybridization capture, cancer, FFPE, DNA mutations
Identification of Metabolically Active Bacteria in the Gut of the Generalist Spodoptera littoralis via DNA Stable Isotope Probing Using 13C-Glucose
Institutions: Max Planck Institute for Chemical Ecology.
Guts of most insects are inhabited by complex communities of symbiotic nonpathogenic bacteria. Within such microbial communities it is possible to identify commensal or mutualistic bacteria species. The latter ones, have been observed to serve multiple functions to the insect, i.e.
helping in insect reproduction1
, boosting the immune response2
, pheromone production3
, as well as nutrition, including the synthesis of essential amino acids4,
Due to the importance of these associations, many efforts have been made to characterize the communities down to the individual members. However, most of these efforts were either based on cultivation methods or relied on the generation of 16S rRNA gene fragments which were sequenced for final identification. Unfortunately, these approaches only identified the bacterial species present in the gut and provided no information on the metabolic activity of the microorganisms.
To characterize the metabolically active bacterial species in the gut of an insect, we used stable isotope probing (SIP) in vivo
C-glucose as a universal substrate. This is a promising culture-free technique that allows the linkage of microbial phylogenies to their particular metabolic activity. This is possible by tracking stable, isotope labeled atoms from substrates into microbial biomarkers, such as DNA and RNA5
. The incorporation of 13
C isotopes into DNA increases the density of the labeled DNA compared to the unlabeled (12
C) one. In the end, the 13
C-labeled DNA or RNA is separated by density-gradient ultracentrifugation from the 12
C-unlabeled similar one6
. Subsequent molecular analysis of the separated nucleic acid isotopomers provides the connection between metabolic activity and identity of the species.
Here, we present the protocol used to characterize the metabolically active bacteria in the gut of a generalist insect (our model system), Spodoptera littoralis
). The phylogenetic analysis of the DNA was done using pyrosequencing, which allowed high resolution and precision in the identification of insect gut bacterial community. As main substrate, 13
C-labeled glucose was used in the experiments. The substrate was fed to the insects using an artificial diet.
Microbiology, Issue 81, Insects, Sequence Analysis, Genetics, Microbial, Bacteria, Lepidoptera, Spodoptera littoralis, stable-isotope-probing (SIP), pyro-sequencing, 13C-glucose, gut, microbiota, bacteria
Competitive Genomic Screens of Barcoded Yeast Libraries
Institutions: University of Toronto, University of Toronto, University of Toronto, National Human Genome Research Institute, NIH, Stanford University , University of Toronto.
By virtue of advances in next generation sequencing technologies, we have access to new genome sequences almost daily. The tempo of these advances is accelerating, promising greater depth and breadth. In light of these extraordinary advances, the need for fast, parallel methods to define gene function becomes ever more important. Collections of genome-wide deletion mutants in yeasts and E. coli
have served as workhorses for functional characterization of gene function, but this approach is not scalable, current gene-deletion approaches require each of the thousands of genes that comprise a genome to be deleted and verified. Only after this work is complete can we pursue high-throughput phenotyping. Over the past decade, our laboratory has refined a portfolio of competitive, miniaturized, high-throughput genome-wide assays that can be performed in parallel. This parallelization is possible because of the inclusion of DNA 'tags', or 'barcodes,' into each mutant, with the barcode serving as a proxy for the mutation and one can measure the barcode abundance to assess mutant fitness. In this study, we seek to fill the gap between DNA sequence and barcoded mutant collections. To accomplish this we introduce a combined transposon disruption-barcoding approach that opens up parallel barcode assays to newly sequenced, but poorly characterized microbes. To illustrate this approach we present a new Candida albicans
barcoded disruption collection and describe how both microarray-based and next generation sequencing-based platforms can be used to collect 10,000 - 1,000,000 gene-gene and drug-gene interactions in a single experiment.
Biochemistry, Issue 54, chemical biology, chemogenomics, chemical probes, barcode microarray, next generation sequencing
A PCR-based Genotyping Method to Distinguish Between Wild-type and Ornamental Varieties of Imperata cylindrica
Institutions: The University of Alabama, Huntsville, Center for Plant Health Science and Technology.
Wild-type I. cylindrica
(cogongrass) is one of the top ten worst invasive plants in the world, negatively impacting agricultural and natural resources in 73 different countries throughout Africa, Asia, Europe, New Zealand, Oceania and the Americas1-2
. Cogongrass forms rapidly-spreading, monodominant stands that displace a large variety of native plant species and in turn threaten the native animals that depend on the displaced native plant species for forage and shelter. To add to the problem, an ornamental variety [I. cylindrica
(Retzius)] is widely marketed under the names of Imperata cylindrica
'Rubra', Red Baron, and Japanese blood grass (JBG). This variety is putatively sterile and noninvasive and is considered a desirable ornamental for its red-colored leaves. However, under the correct conditions, JBG can produce viable seed (Carol Holko, 2009 personal communication) and can revert to a green invasive form that is often indistinguishable from cogongrass as it takes on the distinguishing characteristics of the wild-type invasive variety4
). This makes identification using morphology a difficult task even for well-trained plant taxonomists. Reversion of JBG to an aggressive green phenotype is also not a rare occurrence. Using sequence comparisons of coding and variable regions in both nuclear and chloroplast DNA, we have confirmed that JBG has reverted to the green invasive within the states of Maryland, South Carolina, and Missouri. JBG has been sold and planted in just about every state in the continental U.S. where there is not an active cogongrass infestation. The extent of the revert problem in not well understood because reverted plants are undocumented and often destroyed.
Application of this molecular protocol provides a method to identify JBG reverts and can help keep these varieties from co-occurring and possibly hybridizing. Cogongrass is an obligate outcrosser and, when crossed with a different genotype, can produce viable wind-dispersed seeds that spread cogongrass over wide distances5-7
. JBG has a slightly different genotype than cogongrass and may be able to form viable hybrids with cogongrass. To add to the problem, JBG is more cold and shade tolerant than cogongrass8-10
, and gene flow between these two varieties is likely to generate hybrids that are more aggressive, shade tolerant, and cold hardy than wild-type cogongrass. While wild-type cogongrass currently infests over 490 million hectares worldwide, in the Southeast U.S. it infests over 500,000 hectares and is capable of occupying most of the U.S. as it rapidly spreads northward due to its broad niche and geographic potential3,7,11
. The potential of a genetic crossing is a serious concern for the USDA-APHIS Federal Noxious Week Program. Currently, the USDA-APHIS prohibits JBG in states where there are major cogongrass infestations (e.g., Florida, Alabama, Mississippi). However, preventing the two varieties from combining can prove more difficult as cogongrass and JBG expand their distributions. Furthermore, the distribution of the JBG revert is currently unknown and without the ability to identify these varieties through morphology, some cogongrass infestations may be the result of JBG reverts. Unfortunately, current molecular methods of identification typically rely on AFLP (Amplified Fragment Length Polymorphisms) and DNA sequencing, both of which are time consuming and costly. Here, we present the first cost-effective and reliable PCR-based molecular genotyping method to accurately distinguish between cogongrass and JBG revert.
Molecular Biology, Issue 60, Molecular genotyping, Japanese blood grass, Red Baron, cogongrass, invasive plants
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI
program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al.
2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g.
protein-protein) and functional (e.g.
gene-gene or genetic) interactions (GI)1
. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2
. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7
, but GI information remains sparse for prokaryotes8
, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10
Here, we present the key steps required to perform quantitative E. coli
Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9
, using natural bacterial conjugation and homologous recombination to systemically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm) - marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan) - marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g.
the 'Keio' collection11
) and essential gene hypomorphic mutations (i.e.
alleles conferring reduced protein expression, stability, or activity9, 12, 13
) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14
. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9
. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2
. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e.
slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2
as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Polymerase Chain Reaction: Basic Protocol Plus Troubleshooting and Optimization Strategies
Institutions: University of California, Los Angeles .
In the biological sciences there have been technological advances that catapult the discipline into golden ages of discovery. For example, the field of microbiology was transformed with the advent of Anton van Leeuwenhoek's microscope, which allowed scientists to visualize prokaryotes for the first time. The development of the polymerase chain reaction (PCR) is one of those innovations that changed the course of molecular science with its impact spanning countless subdisciplines in biology. The theoretical process was outlined by Keppe and coworkers in 1971; however, it was another 14 years until the complete PCR procedure was described and experimentally applied by Kary Mullis while at Cetus Corporation in 1985. Automation and refinement of this technique progressed with the introduction of a thermal stable DNA polymerase from the bacterium Thermus aquaticus
, consequently the name Taq
PCR is a powerful amplification technique that can generate an ample supply of a specific segment of DNA (i.e., an amplicon) from only a small amount of starting material (i.e., DNA template or target sequence). While straightforward and generally trouble-free, there are pitfalls that complicate the reaction producing spurious results. When PCR fails it can lead to many non-specific DNA products of varying sizes that appear as a ladder or smear of bands on agarose gels. Sometimes no products form at all. Another potential problem occurs when mutations are unintentionally introduced in the amplicons, resulting in a heterogeneous population of PCR products. PCR failures can become frustrating unless patience and careful troubleshooting are employed to sort out and solve the problem(s). This protocol outlines the basic principles of PCR, provides a methodology that will result in amplification of most target sequences, and presents strategies for optimizing a reaction. By following this PCR guide, students should be able to:
● Set up reactions and thermal cycling conditions for a conventional PCR experiment
● Understand the function of various reaction components and their overall effect on a PCR experiment
● Design and optimize a PCR experiment for any DNA template
● Troubleshoot failed PCR experiments
Basic Protocols, Issue 63, PCR, optimization, primer design, melting temperature, Tm, troubleshooting, additives, enhancers, template DNA quantification, thermal cycler, molecular biology, genetics
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article would be that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also serve as an educational resource that could be incorporated into a classroom or teaching-lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1
and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8
The ITS2 Database9
presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11
. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12
(direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13
. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14
search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16
for multiple sequence-structure alignment calculation and Neighbor Joining18
tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Simultaneous Multicolor Imaging of Biological Structures with Fluorescence Photoactivation Localization Microscopy
Institutions: University of Maine.
Localization-based super resolution microscopy can be applied to obtain a spatial map (image) of the distribution of individual fluorescently labeled single molecules within a sample with a spatial resolution of tens of nanometers. Using either photoactivatable (PAFP) or photoswitchable (PSFP) fluorescent proteins fused to proteins of interest, or organic dyes conjugated to antibodies or other molecules of interest, fluorescence photoactivation localization microscopy (FPALM) can simultaneously image multiple species of molecules within single cells. By using the following approach, populations of large numbers (thousands to hundreds of thousands) of individual molecules are imaged in single cells and localized with a precision of ~10-30 nm. Data obtained can be applied to understanding the nanoscale spatial distributions of multiple protein types within a cell. One primary advantage of this technique is the dramatic increase in spatial resolution: while diffraction limits resolution to ~200-250 nm in conventional light microscopy, FPALM can image length scales more than an order of magnitude smaller. As many biological hypotheses concern the spatial relationships among different biomolecules, the improved resolution of FPALM can provide insight into questions of cellular organization which have previously been inaccessible to conventional fluorescence microscopy. In addition to detailing the methods for sample preparation and data acquisition, we here describe the optical setup for FPALM. One additional consideration for researchers wishing to do super-resolution microscopy is cost: in-house setups are significantly cheaper than most commercially available imaging machines. Limitations of this technique include the need for optimizing the labeling of molecules of interest within cell samples, and the need for post-processing software to visualize results. We here describe the use of PAFP and PSFP expression to image two protein species in fixed cells. Extension of the technique to living cells is also described.
Basic Protocol, Issue 82, Microscopy, Super-resolution imaging, Multicolor, single molecule, FPALM, Localization microscopy, fluorescent proteins
Unraveling the Unseen Players in the Ocean - A Field Guide to Water Chemistry and Marine Microbiology
Institutions: San Diego State University, University of California San Diego.
Here we introduce a series of thoroughly tested and well standardized research protocols adapted for use in remote marine environments. The sampling protocols include the assessment of resources available to the microbial community (dissolved organic carbon, particulate organic matter, inorganic nutrients), and a comprehensive description of the viral and bacterial communities (via direct viral and microbial counts, enumeration of autofluorescent microbes, and construction of viral and microbial metagenomes). We use a combination of methods, which represent a dispersed field of scientific disciplines comprising already established protocols and some of the most recent techniques developed. Especially metagenomic sequencing techniques used for viral and bacterial community characterization, have been established only in recent years, and are thus still subjected to constant improvement. This has led to a variety of sampling and sample processing procedures currently in use. The set of methods presented here provides an up to date approach to collect and process environmental samples. Parameters addressed with these protocols yield the minimum on information essential to characterize and understand the underlying mechanisms of viral and microbial community dynamics. It gives easy to follow guidelines to conduct comprehensive surveys and discusses critical steps and potential caveats pertinent to each technique.
Environmental Sciences, Issue 93, dissolved organic carbon, particulate organic matter, nutrients, DAPI, SYBR, microbial metagenomics, viral metagenomics, marine environment
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan , University of Kansas.
Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation
Institutions: Agency for Science, Technology and Research, Singapore, A*STAR-Duke-NUS Neuroscience Research Partnership, Singapore, National University of Singapore, Singapore.
Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces 7, 2, 12
. Such architectures are not random and involve interactions between gene promoters and regulatory elements 13
. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination 1, 14
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures 5,6
. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis is then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions 8
We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video.
Genetics, Issue 62, ChIP, ChIA-PET, Chromatin Interactions, Genomics, Next-Generation Sequencing
Molecular Evolution of the Tre Recombinase
Institutions: Max Plank Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase, that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. Acting as intermediates, recombination activity was shown with the subsets. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles individual recombinases were functionally and structurally analyzed. The most active recombinase -- Tre -- had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution
Principles of Site-Specific Recombinase (SSR) Technology
Institutions: Max Plank Institute for Molecular Cell Biology and Genetics, Dresden.
Site-specific recombinase (SSR) technology allows the manipulation of gene structure to explore gene function and has become an integral tool of molecular biology. Site-specific recombinases are proteins that bind to distinct DNA target sequences. The Cre/lox system was first described in bacteriophages during the 1980's. Cre recombinase is a Type I topoisomerase that catalyzes site-specific recombination of DNA between two loxP (locus of X-over P1) sites. The Cre/lox system does not require any cofactors. LoxP sequences contain distinct binding sites for Cre recombinases that surround a directional core sequence where recombination and rearrangement takes place. When cells contain loxP sites and express the Cre recombinase, a recombination event occurs. Double-stranded DNA is cut at both loxP sites by the Cre recombinase, rearranged, and ligated ("scissors and glue"). Products of the recombination event depend on the relative orientation of the asymmetric sequences.
SSR technology is frequently used as a tool to explore gene function. Here the gene of interest is flanked with Cre target sites loxP ("floxed"). Animals are then crossed with animals expressing the Cre recombinase under the control of a tissue-specific promoter. In tissues that express the Cre recombinase it binds to target sequences and excises the floxed gene. Controlled gene deletion allows the investigation of gene function in specific tissues and at distinct time points. Analysis of gene function employing SSR technology --- conditional mutagenesis -- has significant advantages over traditional knock-outs where gene deletion is frequently lethal.
Cellular Biology, Issue 15, Molecular Biology, Site-Specific Recombinase, Cre recombinase, Cre/lox system, transgenic animals, transgenic technology
Electroporation of Mycobacteria
Institutions: Barts and the London School of Medicine and Dentistry, Barts and the London School of Medicine and Dentistry.
High efficiency transformation is a major limitation in the study of mycobacteria. The genus Mycobacterium can be difficult to transform; this is mainly caused by the thick and waxy cell wall, but is compounded by the fact that most molecular techniques have been developed for distantly-related species such as Escherichia coli and Bacillus subtilis. In spite of these obstacles, mycobacterial plasmids have been identified and DNA transformation of many mycobacterial species have now been described. The most successful method for introducing DNA into mycobacteria is electroporation. Many parameters contribute to successful transformation; these include the species/strain, the nature of the transforming DNA, the selectable marker used, the growth medium, and the conditions for the electroporation pulse. Optimized methods for the transformation of both slow- and fast-grower are detailed here. Transformation efficiencies for different mycobacterial species and with various selectable markers are reported.
Microbiology, Issue 15, Springer Protocols, Mycobacteria, Electroporation, Bacterial Transformation, Transformation Efficiency, Bacteria, Tuberculosis, M. Smegmatis, Springer Protocols
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif