Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules, which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score that estimates how accurate the predictions are in the absence of experimental data. To accommodate special requests from end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. Users can collect this structural information from experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was ranked as the best program for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
24 Related JoVE Articles!
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and levels of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
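The idea of a region "enriched by a particular base or combination of bases" in the Genomic MRI abstract above can be illustrated with a sliding-window scan. This is only a minimal sketch and not the Genomic MRI package itself: the function names, the 30-nucleotide window and the 0.75 threshold are assumptions chosen for illustration.

```python
def window_content(seq, bases="GT", window=30):
    """Fraction of `bases` in every sliding window of length `window`.

    Hypothetical helper for illustration; not part of Genomic MRI.
    """
    seq = seq.upper()
    targets = set(bases.upper())
    if window <= 0 or window > len(seq):
        return []
    hits = [1 if ch in targets else 0 for ch in seq]
    count = sum(hits[:window])
    fractions = [count / window]
    # Slide the window one position at a time, updating the running count.
    for i in range(window, len(seq)):
        count += hits[i] - hits[i - window]
        fractions.append(count / window)
    return fractions


def enriched_windows(seq, bases="GT", window=30, threshold=0.75):
    """Start positions of windows whose content of `bases` reaches `threshold`."""
    return [
        start
        for start, fraction in enumerate(window_content(seq, bases, window))
        if fraction >= threshold
    ]


# Toy sequence: a (G+T)-rich stretch flanked by A/C-only sequence.
toy = "ACAC" * 10 + "GT" * 20 + "ACAC" * 10
print(enriched_windows(toy))
```

Overlapping enriched windows would normally be merged into regions and compared against randomized sequences, which is the kind of control the package is described as generating.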
Experimental Protocol for Manipulating Plant-induced Soil Heterogeneity
Institutions: Case Western Reserve University.
Coexistence theory has often treated environmental heterogeneity as being independent of the community composition; however biotic feedbacks such as plant-soil feedbacks (PSF) have large effects on plant performance, and create environmental heterogeneity that depends on the community composition. Understanding the importance of PSF for plant community assembly necessitates understanding of the role of heterogeneity in PSF, in addition to mean PSF effects. Here, we describe a protocol for manipulating plant-induced soil heterogeneity. Two example experiments are presented: (1) a field experiment with a 6-patch grid of soils to measure plant population responses and (2) a greenhouse experiment with 2-patch soils to measure individual plant responses. Soils can be collected from the zone of root influence (soils from the rhizosphere and directly adjacent to the rhizosphere) of plants in the field from conspecific and heterospecific plant species. Replicate collections are used to avoid pseudoreplicating soil samples. These soils are then placed into separate patches for heterogeneous treatments or mixed for a homogenized treatment. Care should be taken to ensure that heterogeneous and homogenized treatments experience the same degree of soil disturbance. Plants can then be placed in these soil treatments to determine the effect of plant-induced soil heterogeneity on plant performance. We demonstrate that plant-induced heterogeneity results in different outcomes than predicted by traditional coexistence models, perhaps because of the dynamic nature of these feedbacks. Theory that incorporates environmental heterogeneity influenced by the assembling community and additional empirical work is needed to determine when heterogeneity intrinsic to the assembling community will result in different assembly outcomes compared with heterogeneity extrinsic to the community composition.
Environmental Sciences, Issue 85, Coexistence, community assembly, environmental drivers, plant-soil feedback, soil heterogeneity, soil microbial communities, soil patch
Single-plant, Sterile Microcosms for Nodulation and Growth of the Legume Plant Medicago truncatula with the Rhizobial Symbiont Sinorhizobium meliloti
Institutions: Florida State University.
Rhizobial bacteria form symbiotic, nitrogen-fixing nodules on the roots of compatible host legume plants. One of the most well-developed model systems for studying these interactions is the plant Medicago truncatula cv. Jemalong A17 and the rhizobial bacterium Sinorhizobium meliloti 1021. Repeated imaging of plant roots and scoring of symbiotic phenotypes requires methods that are non-destructive to either plants or bacteria. The symbiotic phenotypes of some plant and bacterial mutants become apparent after relatively short periods of growth, and do not require long-term observation of the host/symbiont interaction. However, subtle differences in symbiotic efficiency and nodule senescence phenotypes that are not apparent in the early stages of the nodulation process require relatively long growth periods before they can be scored. Several methods have been developed for long-term growth and observation of this host/symbiont pair. However, many of these methods require repeated watering, which increases the possibility of contamination by other microbes. Other methods require a relatively large space for growth of large numbers of plants. The method described here, symbiotic growth of M. truncatula/S. meliloti in sterile, single-plant microcosms, has several advantages. Plants in these microcosms have sufficient moisture and nutrients to ensure that watering is not required for up to 9 weeks, preventing cross-contamination during watering. This allows phenotypes to be quantified that might be missed in short-term growth systems, such as subtle delays in nodule development and early nodule senescence. Also, the roots and nodules in the microcosm are easily viewed through the plate lid, so up-rooting of the plants for observation is not required.
Environmental Sciences, Issue 80, Plant Roots, Medicago, Gram-Negative Bacteria, Nitrogen, Microbiological Techniques, Bacterial Processes, Symbiosis, botany, microbiology, Medicago truncatula, Sinorhizobium meliloti, nodule, nitrogen fixation, legume, rhizobia, bacteria
Design and Operation of a Continuous 13C and 15N Labeling Chamber for Uniform or Differential, Metabolic and Structural, Plant Isotope Labeling
Institutions: Colorado State University, USDA-ARS, Colorado State University.
Tracing rare stable isotopes from plant material through the ecosystem provides the most sensitive information about ecosystem processes, from CO2 fluxes and soil organic matter formation to small-scale stable-isotope biomarker probing. Coupling multiple stable isotopes such as 13C with 15N, 18O or 2H has the potential to reveal even more information about complex stoichiometric relationships during biogeochemical transformations. Isotope labeled plant material has been used in various studies of litter decomposition and soil organic matter formation1-4. From these and other studies, however, it has become apparent that structural components of plant material behave differently than metabolic components (i.e. leachable low molecular weight compounds) in terms of microbial utilization and long-term carbon storage5-7. The ability to study structural and metabolic components separately provides a powerful new tool for advancing the forefront of ecosystem biogeochemical studies. Here we describe a method for producing 13C and 15N labeled plant material that is either uniformly labeled throughout the plant or differentially labeled in structural and metabolic plant components.
Here, we present the construction and operation of a continuous 13C and 15N labeling chamber that can be modified to meet various research needs. Uniformly labeled plant material is produced by continuous labeling from seedling to harvest, while differential labeling is achieved by removing the growing plants from the chamber weeks prior to harvest. Representative results from growing Andropogon gerardii Kaw demonstrate the system's ability to efficiently label plant material at the targeted levels. Through this method we have produced plant material with a 4.4 atom% 13C and 6.7 atom% 15N uniform plant label, or material that is differentially labeled by up to 1.29 atom% 13C and 0.56 atom% 15N in its metabolic and structural components (hot water extractable and hot water residual components, respectively). Challenges lie in maintaining proper temperature, humidity, CO2 concentration, and light levels in an airtight 13C-CO2 atmosphere for successful plant production. This chamber description represents a useful research tool to effectively produce uniformly or differentially multi-isotope labeled plant material for use in experiments on ecosystem biogeochemical cycling.
Environmental Sciences, Issue 83, 13C, 15N, plant, stable isotope labeling, Andropogon gerardii, metabolic compounds, structural compounds, hot water extraction
Optimization and Utilization of Agrobacterium-mediated Transient Protein Production in Nicotiana
Institutions: Fraunhofer USA Center for Molecular Biotechnology.
Agrobacterium-mediated transient protein production in plants is a promising approach to produce vaccine antigens and therapeutic proteins within a short period of time. However, this technology is only just beginning to be applied to large-scale production as many technological obstacles to scale up are now being overcome. Here, we demonstrate a simple and reproducible method for industrial-scale transient protein production based on vacuum infiltration of Nicotiana plants with Agrobacteria carrying launch vectors. Optimization of Agrobacterium cultivation in AB medium allows direct dilution of the bacterial culture in Milli-Q water, simplifying the infiltration process. Among three tested species of Nicotiana, N. excelsiana (N. benthamiana × N. excelsior) was selected as the most promising host due to the ease of infiltration, high level of reporter protein production, and about two-fold higher biomass production under controlled environmental conditions. Induction of Agrobacterium harboring pBID4-GFP (Tobacco mosaic virus-based) using chemicals such as acetosyringone and monosaccharide had no effect on the protein production level. Infiltrating plants under 50 to 100 mbar for 30 or 60 sec resulted in about 95% infiltration of plant leaf tissues. Infiltration with Agrobacterium laboratory strain GV3101 showed the highest protein production compared to Agrobacteria laboratory strains LBA4404 and C58C1 and wild-type Agrobacteria strains at6, at10, at77 and A4. Co-expression of a viral RNA silencing suppressor, p23 or p19, in N. benthamiana resulted in earlier accumulation and increased production (15-25%) of target protein (influenza virus hemagglutinin).
Plant Biology, Issue 86, Agroinfiltration, Nicotiana benthamiana, transient protein production, plant-based expression, viral vector, Agrobacteria
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
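The "setup of experiment combinations" mentioned in the design of experiments abstract above can be sketched with the simplest possible design, a full factorial, in which every level of every factor is combined. This is a hedged illustration only: the abstract refers to software-guided optimal designs, which select a subset of such runs, and the factor names and levels below are hypothetical placeholders rather than values from the study.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "promoter": ["P1", "P2"],
    "plant_age_days": [35, 42, 49],
    "incubation_temp_c": [22, 25],
}


def full_factorial(factors):
    """Return one dict per run, covering every combination of factor levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


runs = full_factorial(factors)
print(len(runs))  # 2 * 3 * 2 combinations
for run in runs[:3]:
    print(run)
```

The run count grows multiplicatively with each added factor, which is one reason the abstract stresses complexity reduction and step-wise design augmentation instead of testing everything at once.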
Efficient Agroinfiltration of Plants for High-level Transient Expression of Recombinant Proteins
Institutions: Arizona State University.
Mammalian cell culture is the major platform for commercial production of human vaccines and therapeutic proteins. However, it cannot meet the increasing worldwide demand for pharmaceuticals due to its limited scalability and high cost. Plants have been shown to be one of the most promising alternative pharmaceutical production platforms that are robust, scalable, low-cost and safe. The recent development of virus-based vectors has allowed rapid and high-level transient expression of recombinant proteins in plants. To further optimize the utility of the transient expression system, we demonstrate in this study a simple, efficient and scalable methodology to introduce target-gene containing Agrobacterium into plant tissue. Our results indicate that agroinfiltration with both syringe and vacuum methods results in the efficient introduction of Agrobacterium into leaves and robust production of two fluorescent proteins: GFP and DsRed. Furthermore, we demonstrate the unique advantages offered by both methods. Syringe infiltration is simple and does not need expensive equipment. It also allows the flexibility to either infiltrate the entire leaf with one target gene, or to introduce genes of multiple targets on one leaf. Thus, it can be used for laboratory scale expression of recombinant proteins as well as for comparing different proteins or vectors for yield or expression kinetics. The simplicity of syringe infiltration also suggests its utility in high school and college education for the subject of biotechnology. In contrast, vacuum infiltration is more robust and can be scaled up for commercial manufacture of pharmaceutical proteins. It also offers the advantage of being able to agroinfiltrate plant species that are not amenable to syringe infiltration, such as lettuce and Arabidopsis. Overall, the combination of syringe and vacuum agroinfiltration provides researchers and educators a simple, efficient, and robust methodology for transient protein expression. It will greatly facilitate the development of pharmaceutical proteins and promote science education.
Plant Biology, Issue 77, Genetics, Molecular Biology, Cellular Biology, Virology, Microbiology, Bioengineering, Plant Viruses, Antibodies, Monoclonal, Green Fluorescent Proteins, Plant Proteins, Recombinant Proteins, Vaccines, Synthetic, Virus-Like Particle, Gene Transfer Techniques, Gene Expression, Agroinfiltration, plant infiltration, plant-made pharmaceuticals, syringe agroinfiltration, vacuum agroinfiltration, monoclonal antibody, Agrobacterium tumefaciens, Nicotiana benthamiana, GFP, DsRed, geminiviral vectors, imaging, plant model
A Rapid and Efficient Method for Assessing Pathogenicity of Ustilago maydis on Maize and Teosinte Lines
Institutions: University of Georgia.
Maize is a major cereal crop worldwide. However, susceptibility to biotrophic pathogens is the primary constraint to increasing productivity. U. maydis is a biotrophic fungal pathogen and the causal agent of corn smut on maize. This disease is responsible for significant yield losses of approximately $1.0 billion annually in the U.S.1 Several methods including crop rotation, fungicide application and seed treatments are currently used to control corn smut2. However, host resistance is the only practical method for managing corn smut. Identification of crop plants including maize, wheat, and rice that are resistant to various biotrophic pathogens has significantly decreased yield losses annually3-5. Therefore, the use of a pathogen inoculation method that efficiently and reproducibly delivers the pathogen in between the plant leaves would facilitate the rapid identification of maize lines that are resistant to U. maydis. As a first step toward identifying maize lines that are resistant to U. maydis, a needle injection inoculation method and a resistance reaction screening method were utilized to inoculate maize, teosinte, and maize x teosinte introgression lines with a U. maydis strain and to select resistant plants.
Maize, teosinte and maize x teosinte introgression lines, consisting of about 700 plants, were planted, inoculated with a strain of U. maydis, and screened for resistance. The inoculation and screening methods successfully identified three teosinte lines resistant to U. maydis. Here a detailed needle injection inoculation and resistance reaction screening protocol for maize, teosinte, and maize x teosinte introgression lines is presented. This study demonstrates that needle injection inoculation is an invaluable tool in agriculture that can efficiently deliver U. maydis in between the plant leaves and has provided plant lines that are resistant to U. maydis that can now be combined and tested in breeding programs for improved disease resistance.
Environmental Sciences, Issue 83, Bacterial Infections, Signs and Symptoms, Eukaryota, Plant Physiological Phenomena, Ustilago maydis, needle injection inoculation, disease rating scale, plant-pathogen interactions
Non-radioactive in situ Hybridization Protocol Applicable for Norway Spruce and a Range of Plant Species
Institutions: Uppsala University, Swedish University of Agricultural Sciences.
The high-throughput expression analysis technologies available today give scientists an overflow of expression profiles, but their resolution in terms of tissue specific expression is limited because of problems in dissecting individual tissues. Expression data need to be confirmed and complemented with expression patterns using e.g. in situ hybridization, a technique used to localize cell specific mRNA expression. The in situ hybridization method is laborious, time-consuming and often requires extensive optimization depending on species and tissue. In situ experiments are relatively more difficult to perform in woody species such as the conifer Norway spruce (Picea abies). Here we present a modified DIG in situ hybridization protocol, which is fast and applicable to a wide range of plant species including P. abies. With just a few adjustments, including altered RNase treatment and proteinase K concentration, we could use the protocol to study tissue specific expression of homologous genes in male reproductive organs of one gymnosperm and two angiosperm species: P. abies, Arabidopsis thaliana and Brassica napus. The protocol worked equally well for the species and genes studied. AtAP3 and BnAP3 were observed in second and third whorl floral organs in A. thaliana and B. napus, and DAL13 in microsporophylls of male cones from P. abies. For P. abies the proteinase K concentration, used to permeabilize the tissues, had to be increased to 3 µg/ml instead of 1 µg/ml, possibly due to more compact tissues and higher levels of phenolics and polysaccharides. For all species the RNase treatment was removed due to reduced signal strength without a corresponding increase in specificity. By comparing tissue specific expression patterns of homologous genes from both flowering plants and a coniferous tree we demonstrate that the DIG in situ protocol presented here, with only minute adjustments, can be applied to a wide range of plant species. Hence, the protocol avoids both extensive species specific optimization and the laborious use of radioactively labeled probes in favor of DIG labeled probes. We have chosen to illustrate the technically demanding steps of the protocol in our film.
Anna Karlgren and Jenny Carlsson contributed equally to this study.
Corresponding authors: Anna Karlgren at Anna.Karlgren@ebc.uu.se and Jens F. Sundström at Jens.Sundstrom@vbsg.slu.se
Plant Biology, Issue 26, RNA, expression analysis, Norway spruce, Arabidopsis, rapeseed, conifers
Methods for Performing Crosses in Setaria viridis, a New Model System for the Grasses
Institutions: Donald Danforth Plant Science Center, Boyce Thompson Institute.
Setaria viridis is an emerging model system for C4 grasses. It is closely related to the bioenergy feedstock switchgrass and the grain crop foxtail millet. Recently, the 510 Mb genome of foxtail millet, S. italica, has been sequenced1,2 and a 25x coverage genome sequence of the weedy relative S. viridis is in progress. S. viridis has a number of characteristics that make it a potentially excellent model genetic system, including a rapid generation time, small stature, simple growth requirements, prolific seed production3 and developed systems for both transient and stable transformation4. However, the genetics of S. viridis is largely unexplored, in part due to the lack of detailed methods for performing crosses. To date, no standard protocol has been adopted that will permit rapid production of seeds from controlled crosses.
The protocol presented here is optimized for performing genetic crosses in S. viridis, accession A10.1. We have employed a simple heat treatment with warm water for emasculation after pruning the panicle to retain 20-30 florets and labeling of flowers to eliminate seeds resulting from newly developed flowers after emasculation. After testing a series of heat treatments at permissive temperatures and varying the duration of dipping, we have established an optimum temperature and time range of 48 °C for 3-6 min. By using this method, a minimum of 15 crosses can be performed by a single worker per day and an average of 3-5 outcross progeny per panicle can be recovered. Therefore, an average of 45-75 outcross progeny can be produced by one person in a single day. Broad implementation of this technique will facilitate the development of recombinant inbred line populations of S. viridis X S. viridis or S. viridis X S. italica, mapping mutations through bulk segregant analysis and creating higher order mutants for genetic analysis.
Environmental Sciences, Issue 80, Hybridization, Genetics, plants, Setaria viridis, crosses, emasculation, flowering, seed propagation, seed dormancy
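The throughput figure in the Setaria abstract above follows from simple multiplication: 15 crosses per day times 3-5 outcross progeny per panicle gives 45-75 progeny per day. A small sketch makes the arithmetic explicit and reusable for other assumed workloads; the function name is hypothetical, and one panicle per cross is assumed.

```python
def daily_progeny_range(crosses_per_day, progeny_per_panicle):
    """Expected (low, high) outcross progeny per worker per day.

    Assumes one panicle per cross; `progeny_per_panicle` is a (low, high) pair.
    """
    low, high = progeny_per_panicle
    return crosses_per_day * low, crosses_per_day * high


# Figures quoted in the abstract: 15 crosses/day, 3-5 progeny per panicle.
print(daily_progeny_range(15, (3, 5)))
```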
Monitoring of Systemic and Hepatic Hemodynamic Parameters in Mice
Institutions: Jena University Hospital, Jena University Hospital, The First Affiliated Hospital of Wenzhou Medical University.
The use of mouse models in experimental research is of enormous importance for the study of hepatic physiology and pathophysiological disturbances. However, due to the small size of the mouse, technical details of an intraoperative monitoring procedure suitable for the mouse have rarely been described. Previously we reported a monitoring procedure to obtain hemodynamic parameters for rats. Now, we have adapted the procedure to acquire systemic and hepatic hemodynamic parameters in mice, a species ten-fold smaller than rats. This film demonstrates the instrumentation of the animals as well as the data acquisition process needed to assess systemic and hepatic hemodynamics in mice. Vital parameters, including body temperature, respiratory rate and heart rate, were recorded throughout the whole procedure. Systemic hemodynamic parameters consist of carotid artery pressure (CAP) and central venous pressure (CVP). Hepatic perfusion parameters include portal vein pressure (PVP), portal flow rate as well as the flow rate of the common hepatic artery (table 1). Instrumentation and data acquisition to record the normal values was completed within 1.5 h. Systemic and hepatic hemodynamic parameters remained within normal ranges during this procedure.
This procedure is challenging but feasible. We have already applied this procedure to assess hepatic hemodynamics in normal mice as well as during 70% partial hepatectomy and in liver lobe clamping experiments. Mean PVP after resection (n = 20) was 11.41±2.94 cmH2O, which was significantly higher (P<0.05) than before resection (6.87±2.39 cmH2O). The results of the liver lobe clamping experiment indicated that this monitoring procedure is sensitive and suitable for detecting small changes in portal pressure and portal flow rate. In conclusion, this procedure is reliable in the hands of an experienced micro-surgeon but should be limited to experiments where mice are absolutely needed.
Medicine, Issue 92, mice, hemodynamics, hepatic perfusion, CAP, CVP, surgery, intraoperative monitoring, portal vein pressure, blood flow
Generation of Comprehensive Thoracic Oncology Database - Tool for Translational Research
Institutions: University of Chicago, University of Chicago, Northshore University Health Systems, University of Chicago, University of Chicago, University of Chicago.
The Thoracic Oncology Program Database Project was created to serve as a comprehensive, verified, and accessible repository for well-annotated cancer specimens and clinical data to be available to researchers within the Thoracic Oncology Research Program. This database also captures a large volume of genomic and proteomic data obtained from various tumor tissue studies. A team of clinical and basic science researchers, a biostatistician, and a bioinformatics expert was convened to design the database. Variables of interest were clearly defined and their descriptions were written within a standard operating manual to ensure consistency of data annotation. Using a protocol for prospective tissue banking and another protocol for retrospective banking, tumor and normal tissue samples were collected from patients who consented to these protocols. Clinical information such as demographics, cancer characterization, and treatment plans for these patients was abstracted and entered into an Access database. Proteomic and genomic data have been included in the database and have been linked to clinical information for patients described within the database. The data from each table were linked using the relationships function in Microsoft Access to allow the database manager to connect clinical and laboratory information during a query. The queried data can then be exported for statistical analysis and hypothesis generation.
Medicine, Issue 47, Database, Thoracic oncology, Bioinformatics, Biorepository, Microsoft Access, Proteomics, Genomics
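The linking of clinical and laboratory tables described in the thoracic oncology abstract above is, in relational terms, a join across tables that share key columns. The sketch below illustrates that idea with Python's built-in sqlite3 module rather than Microsoft Access; the table names, column names and rows are made-up placeholders, not the project's schema or data.

```python
import sqlite3

# In-memory database with a hypothetical three-table layout.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patients (
        patient_id INTEGER PRIMARY KEY,
        diagnosis  TEXT
    );
    CREATE TABLE samples (
        sample_id   INTEGER PRIMARY KEY,
        patient_id  INTEGER REFERENCES patients(patient_id),
        tissue_type TEXT
    );
    CREATE TABLE lab_results (
        result_id INTEGER PRIMARY KEY,
        sample_id INTEGER REFERENCES samples(sample_id),
        marker    TEXT,
        value     REAL
    );
    """
)

# Placeholder rows for illustration only.
conn.executemany("INSERT INTO patients VALUES (?, ?)",
                 [(1, "diagnosis A"), (2, "diagnosis B")])
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)",
                 [(10, 1, "tumor"), (11, 1, "normal"), (20, 2, "tumor")])
conn.executemany("INSERT INTO lab_results VALUES (?, ?, ?, ?)",
                 [(100, 10, "marker X", 1.5),
                  (101, 11, "marker X", 0.5),
                  (102, 20, "marker Y", 2.0)])


def clinical_with_lab(conn, marker):
    """Join clinical and laboratory rows for one marker, ordered by sample."""
    query = """
        SELECT p.patient_id, p.diagnosis, s.tissue_type, r.value
        FROM lab_results AS r
        JOIN samples  AS s ON s.sample_id = r.sample_id
        JOIN patients AS p ON p.patient_id = s.patient_id
        WHERE r.marker = ?
        ORDER BY s.sample_id
    """
    return conn.execute(query, (marker,)).fetchall()


for row in clinical_with_lab(conn, "marker X"):
    print(row)
```

The rows returned by such a query are what would then be exported for statistical analysis, as the abstract describes.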
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although neither of these is effortless. By corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in particular tissues, plant species or stress treatments, which is currently an important challenge in understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. Our vision is that this article can serve as a practical training tool for researchers embarking on phylogenetic studies and also as an educational resource that can be incorporated into a classroom or teaching-lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
DNA-affinity-purified Chip (DAP-chip) Method to Determine Gene Targets for Bacterial Two component Regulatory Systems
Institutions: Lawrence Berkeley National Laboratory.
In vivo methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro microarray based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs and thus predicting the function for a sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris.
Genetics, Issue 89, DNA-Affinity-Purified-chip, response regulator, transcription factor binding site, two component system, signal transduction, Desulfovibrio, lactate utilization regulator, ChIP-chip
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1 and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8.
The ITS2 Database9 presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12 (direct fold) results in a correct, four-helix conformation. If this is not the case, the structure is predicted by homology modeling13. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence that could not be folded correctly by a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14 search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16 for multiple sequence-structure alignment calculation and Neighbor Joining18 tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
In Vivo Modeling of the Morbid Human Genome using Danio rerio
Institutions: Duke University Medical Center, Duke University, Duke University Medical Center.
Here, we present methods for the development of assays to query potentially clinically significant nonsynonymous changes using in vivo complementation in zebrafish. Zebrafish (Danio rerio) are a useful animal system due to their experimental tractability; embryos are transparent to enable facile viewing, undergo rapid development ex vivo, and can be genetically manipulated.1 These aspects have allowed for significant advances in the analysis of embryogenesis, molecular processes, and morphogenetic signaling. Taken together, the advantages of this vertebrate model make zebrafish highly amenable to modeling the developmental defects in pediatric disease, and in some cases, adult-onset disorders. Because the zebrafish genome is highly conserved with that of humans (~70% orthologous), it is possible to recapitulate human disease states in zebrafish. This is accomplished either through the injection of mutant human mRNA to induce dominant negative or gain of function alleles, or utilization of morpholino (MO) antisense oligonucleotides to suppress genes to mimic loss of function variants. Through complementation of MO-induced phenotypes with capped human mRNA, our approach enables the interpretation of the deleterious effect of mutations on human protein sequence based on the ability of mutant mRNA to rescue a measurable, physiologically relevant phenotype. Modeling of the human disease alleles occurs through microinjection of zebrafish embryos with MO and/or human mRNA at the 1-4 cell stage, and phenotyping up to seven days post fertilization (dpf). This general strategy can be extended to a wide range of disease phenotypes, as demonstrated in the following protocol. We present our established models for morphogenetic signaling, craniofacial, cardiac, vascular integrity, renal function, and skeletal muscle disorder phenotypes, as well as others.
Molecular Biology, Issue 78, Genetics, Biomedical Engineering, Medicine, Developmental Biology, Biochemistry, Anatomy, Physiology, Bioengineering, Genomics, Medical, zebrafish, in vivo, morpholino, human disease modeling, transcription, PCR, mRNA, DNA, Danio rerio, animal model
Mapping Bacterial Functional Networks and Pathways in Escherichia coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI)1. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7, but GI information remains sparse for prokaryotes8, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10.
Here, we present the key steps required to perform a quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome scale9, using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2, as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
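The comparison of observed and expected double-mutant growth described above can be illustrated with a small sketch. It assumes a multiplicative null model (expected double-mutant fitness equals the product of the two single-mutant fitnesses), which is a common convention but is not stated in the abstract; the function names and the 0.1 cutoff are hypothetical and chosen only for illustration, not taken from the eSGA scoring pipeline.

```python
# Illustrative sketch only: the multiplicative null model and the cutoff
# are assumptions, not the scoring scheme used by the eSGA authors.

def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Return observed minus expected double-mutant fitness.

    w_a, w_b: relative fitness (e.g. normalized colony size) of each single mutant
    w_ab:     relative fitness of the double mutant
    """
    expected = w_a * w_b  # assumed multiplicative expectation
    return w_ab - expected


def classify_interaction(score: float, cutoff: float = 0.1) -> str:
    """Label a score as aggravating, alleviating, or neutral (hypothetical cutoff)."""
    if score <= -cutoff:
        return "aggravating"   # double mutant grows worse than expected
    if score >= cutoff:
        return "alleviating"   # double mutant grows better than expected
    return "neutral"


if __name__ == "__main__":
    # Two genes in compensatory pathways: each single mutant is nearly
    # healthy, but the double mutant is sick.
    print(classify_interaction(interaction_score(0.9, 0.8, 0.3)))
    # Two genes in the same complex: the second deletion adds no further defect.
    print(classify_interaction(interaction_score(0.6, 0.6, 0.6)))
```

In a real screen, the decision would rest on a statistical test over replicate colonies rather than a fixed cutoff; the sketch only shows the direction of the two interaction classes.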
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2, because the composition and spatial configuration of head tissues change dramatically over development3.
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
Identification of Protein Complexes in Escherichia coli using Sequential Peptide Affinity Purification in Combination with Tandem Mass Spectrometry
Institutions: University of Toronto, University of Regina, University of Toronto.
Since most cellular processes are mediated by macromolecular assemblies, the systematic identification of protein-protein interactions (PPI) and the identification of the subunit composition of multi-protein complexes can provide insight into gene function and enhance understanding of biological systems1, 2. Physical interactions can be mapped with high confidence via large-scale isolation and characterization of endogenous protein complexes under near-physiological conditions based on affinity purification of chromosomally-tagged proteins in combination with mass spectrometry (APMS). This approach has been successfully applied in evolutionarily diverse organisms, including yeast, flies, worms, mammalian cells, and bacteria1-6. In particular, we have generated a carboxy-terminal Sequential Peptide Affinity (SPA) dual tagging system for affinity-purifying native protein complexes from cultured gram-negative Escherichia coli, using genetically-tractable host laboratory strains that are well-suited for genome-wide investigations of the fundamental biology and conserved processes of prokaryotes1, 2, 7. Our SPA-tagging system is analogous to the tandem affinity purification method developed originally for yeast8, 9, and consists of a calmodulin binding peptide (CBP) followed by the cleavage site for the highly specific tobacco etch virus (TEV) protease and three copies of the FLAG epitope (3X FLAG), allowing for two consecutive rounds of affinity enrichment. After cassette amplification, sequence-specific linear PCR products encoding the SPA-tag and a selectable marker are integrated and expressed in frame as carboxy-terminal fusions in a DY330 background that is induced to transiently express a highly efficient heterologous bacteriophage lambda recombination system10. Subsequent dual-step purification using calmodulin and anti-FLAG affinity beads enables the highly selective and efficient recovery of even low abundance protein complexes from large-scale cultures. Tandem mass spectrometry is then used to identify the stably co-purifying proteins with high sensitivity (low nanogram detection limits).
Here, we describe detailed step-by-step procedures we commonly use for systematic protein tagging, purification and mass spectrometry-based analysis of soluble protein complexes from E. coli, which can be scaled up and potentially tailored to other bacterial species, including certain opportunistic pathogens that are amenable to recombineering. The resulting physical interactions can often reveal interesting unexpected components and connections suggesting novel mechanistic links. Integration of the PPI data with alternate molecular association data such as genetic (gene-gene) interactions and genomic-context (GC) predictions can facilitate elucidation of the global molecular organization of multi-protein complexes within biological pathways. The networks generated for E. coli can be used to gain insight into the functional architecture of orthologous gene products in other microbes for which functional annotations are currently lacking.
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, affinity purification, Escherichia coli, gram-negative bacteria, cytosolic proteins, SPA-tagging, homologous recombination, mass spectrometry, protein interaction, protein complex
Use of Arabidopsis eceriferum Mutants to Explore Plant Cuticle Biosynthesis
Institutions: University of British Columbia - UBC, University of British Columbia - UBC.
The plant cuticle is a waxy outer covering on plants that has a primary role in water conservation, but is also an important barrier against the entry of pathogenic microorganisms. The cuticle is made up of a tough crosslinked polymer called "cutin" and a protective wax layer that seals the plant surface. The waxy layer of the cuticle is obvious on many plants, appearing as a shiny film on the ivy leaf or as a dusty outer covering on the surface of a grape or a cabbage leaf thanks to light scattering crystals present in the wax. Because the cuticle is an essential adaptation of plants to a terrestrial environment, understanding the genes involved in plant cuticle formation has applications in both agriculture and forestry. Today, we'll show the analysis of plant cuticle mutants identified by forward and reverse genetics approaches.
Plant Biology, Issue 16, Annual Review, Cuticle, Arabidopsis, Eceriferum Mutants, Cryo-SEM, Gas Chromatography
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
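To make the three motif types concrete, the following sketch shows how a non-degenerate, degenerate, or bipartite consensus string can be matched against a sequence by expanding IUPAC nucleotide codes into a regular expression. This is not how BEAM, PRISM, or SPACER search for motifs (they discover motifs rather than match a given one); the helper names `motif_to_regex` and `find_motif` are hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> str:
    """Expand a consensus motif (e.g. 'ASCGWT') into a regex pattern."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(motif: str, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every forward-strand hit.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    seq = "TTACCGATGGAGCGTTAA"
    print(find_motif("ACCGGT", seq))   # non-degenerate motif
    print(find_motif("ASCGWT", seq))   # degenerate motif (S = C/G, W = A/T)
    print(find_motif("ACCnnnnnnnnGGT", "ACC" + "T" * 8 + "GGT"))  # bipartite
```

Scanning the reverse strand would require searching the reverse complement of the sequence as well, which is omitted here to keep the sketch short.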
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Choice and No-Choice Assays for Testing the Resistance of A. thaliana to Chewing Insects
Institutions: Cornell University.
Larvae of the small white cabbage butterfly are a pest in agricultural settings. This caterpillar species feeds on plants in the cabbage family, which includes many crops such as cabbage, broccoli, and Brussels sprouts. Rearing of the insects takes place on cabbage plants in the greenhouse. At least two cages are needed for the rearing of Pieris rapae: one for the larvae and the other to contain the adults, the butterflies. In order to investigate the role of plant hormones and toxic plant chemicals in resistance to this insect pest, we demonstrate two experiments. First, we determine the role of jasmonic acid (JA, a plant hormone often implicated in resistance to insects) in resistance to the chewing insect Pieris rapae. Caterpillar growth can be compared on wild-type and mutant plants impaired in production of JA. This experiment is considered "No Choice", because larvae are forced to subsist on a single plant which synthesizes or is deficient in JA. Second, we demonstrate an experiment that investigates the role of glucosinolates, which are used as oviposition (egg-laying) signals. Here, we use wild-type (WT) and mutant Arabidopsis impaired in glucosinolate production in a "Choice" experiment in which female butterflies are allowed to choose to lay their eggs on plants of either genotype. This video demonstrates the experimental setup for both assays as well as representative results.
Plant Biology, Issue 15, Annual Review, Plant Resistance, Herbivory, Arabidopsis thaliana, Pieris rapae, Caterpillars, Butterflies, Jasmonic Acid, Glucosinolates