The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
20 Related JoVE Articles!
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan , University of Kansas.
Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
In Situ Neutron Powder Diffraction Using Custom-made Lithium-ion Batteries
Institutions: University of Sydney, University of Wollongong, Australian Synchrotron, Australian Nuclear Science and Technology Organisation, University of Wollongong, University of New South Wales.
Li-ion batteries are widely used in portable electronic devices and are considered as promising candidates for higher-energy applications such as electric vehicles.1,2
However, many challenges, such as energy density and battery lifetimes, need to be overcome before this particular battery technology can be widely implemented in such applications.3
This research is challenging, and we outline a method to address these challenges using in situ
NPD to probe the crystal structure of electrodes undergoing electrochemical cycling (charge/discharge) in a battery. NPD data help determine the underlying structural mechanism responsible for a range of electrode properties, and this information can direct the development of better electrodes and batteries.
We briefly review six types of battery designs custom-made for NPD experiments and detail the method to construct the ‘roll-over’ cell that we have successfully used on the high-intensity NPD instrument, WOMBAT, at the Australian Nuclear Science and Technology Organisation (ANSTO). The design considerations and materials used for cell construction are discussed in conjunction with aspects of the actual in situ
NPD experiment and initial directions are presented on how to analyze such complex in situ
Physics, Issue 93, In operando, structure-property relationships, electrochemical cycling, electrochemical cells, crystallography, battery performance
Optimized Negative Staining: a High-throughput Protocol for Examining Small and Asymmetric Protein Structure by Electron Microscopy
Institutions: The Molecular Foundry.
Structural determination of proteins is rather challenging for proteins with molecular masses between 40 - 200 kDa. Considering that more than half of natural proteins have a molecular mass between 40 - 200 kDa1,2
, a robust and high-throughput method with a nanometer resolution capability is needed. Negative staining (NS) electron microscopy (EM) is an easy, rapid, and qualitative approach which has frequently been used in research laboratories to examine protein structure and protein-protein interactions. Unfortunately, conventional NS protocols often generate structural artifacts on proteins, especially with lipoproteins that usually form presenting rouleaux artifacts. By using images of lipoproteins from cryo-electron microscopy (cryo-EM) as a standard, the key parameters in NS specimen preparation conditions were recently screened and reported as the optimized NS protocol (OpNS), a modified conventional NS protocol 3
. Artifacts like rouleaux can be greatly limited by OpNS, additionally providing high contrast along with reasonably high‐resolution (near 1 nm) images of small and asymmetric proteins. These high-resolution and high contrast images are even favorable for an individual protein (a single object, no average) 3D reconstruction, such as a 160 kDa antibody, through the method of electron tomography4,5
. Moreover, OpNS can be a high‐throughput tool to examine hundreds of samples of small proteins. For example, the previously published mechanism of 53 kDa cholesteryl ester transfer protein (CETP) involved the screening and imaging of hundreds of samples 6
. Considering cryo-EM rarely successfully images proteins less than 200 kDa has yet to publish any study involving screening over one hundred sample conditions, it is fair to call OpNS a high-throughput method for studying small proteins. Hopefully the OpNS protocol presented here can be a useful tool to push the boundaries of EM and accelerate EM studies into small protein structure, dynamics and mechanisms.
Environmental Sciences, Issue 90, small and asymmetric protein structure, electron microscopy, optimized negative staining
Design and Operation of a Continuous 13C and 15N Labeling Chamber for Uniform or Differential, Metabolic and Structural, Plant Isotope Labeling
Institutions: Colorado State University, USDA-ARS, Colorado State University.
Tracing rare stable isotopes from plant material through the ecosystem provides the most sensitive information about ecosystem processes; from CO2
fluxes and soil organic matter formation to small-scale stable-isotope biomarker probing. Coupling multiple stable isotopes such as 13
C with 15
O or 2
H has the potential to reveal even more information about complex stoichiometric relationships during biogeochemical transformations. Isotope labeled plant material has been used in various studies of litter decomposition and soil organic matter formation1-4
. From these and other studies, however, it has become apparent that structural components of plant material behave differently than metabolic components (i.e
. leachable low molecular weight compounds) in terms of microbial utilization and long-term carbon storage5-7
. The ability to study structural and metabolic components separately provides a powerful new tool for advancing the forefront of ecosystem biogeochemical studies. Here we describe a method for producing 13
C and 15
N labeled plant material that is either uniformly labeled throughout the plant or differentially labeled in structural and metabolic plant components.
Here, we present the construction and operation of a continuous 13
C and 15
N labeling chamber that can be modified to meet various research needs. Uniformly labeled plant material is produced by continuous labeling from seedling to harvest, while differential labeling is achieved by removing the growing plants from the chamber weeks prior to harvest. Representative results from growing Andropogon gerardii
Kaw demonstrate the system's ability to efficiently label plant material at the targeted levels. Through this method we have produced plant material with a 4.4 atom%13
C and 6.7 atom%15
N uniform plant label, or material that is differentially labeled by up to 1.29 atom%13
C and 0.56 atom%15
N in its metabolic and structural components (hot water extractable and hot water residual components, respectively). Challenges lie in maintaining proper temperature, humidity, CO2
concentration, and light levels in an airtight 13
atmosphere for successful plant production. This chamber description represents a useful research tool to effectively produce uniformly or differentially multi-isotope labeled plant material for use in experiments on ecosystem biogeochemical cycling.
Environmental Sciences, Issue 83, 13C, 15N, plant, stable isotope labeling, Andropogon gerardii, metabolic compounds, structural compounds, hot water extraction
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Scalable Nanohelices for Predictive Studies and Enhanced 3D Visualization
Institutions: University of California Merced, University of California Merced.
Spring-like materials are ubiquitous in nature and of interest in nanotechnology for energy harvesting, hydrogen storage, and biological sensing applications. For predictive simulations, it has become increasingly important to be able to model the structure of nanohelices accurately. To study the effect of local structure on the properties of these complex geometries one must develop realistic models. To date, software packages are rather limited in creating atomistic helical models. This work focuses on producing atomistic models of silica glass (SiO2
) nanoribbons and nanosprings for molecular dynamics (MD) simulations. Using an MD model of “bulk” silica glass, two computational procedures to precisely create the shape of nanoribbons and nanosprings are presented. The first method employs the AWK programming language and open-source software to effectively carve various shapes of silica nanoribbons from the initial bulk model, using desired dimensions and parametric equations to define a helix. With this method, accurate atomistic silica nanoribbons can be generated for a range of pitch values and dimensions. The second method involves a more robust code which allows flexibility in modeling nanohelical structures. This approach utilizes a C++ code particularly written to implement pre-screening methods as well as the mathematical equations for a helix, resulting in greater precision and efficiency when creating nanospring models. Using these codes, well-defined and scalable nanoribbons and nanosprings suited for atomistic simulations can be effectively created. An added value in both open-source codes is that they can be adapted to reproduce different helical structures, independent of material. In addition, a MATLAB graphical user interface (GUI) is used to enhance learning through visualization and interaction for a general user with the atomistic helical structures. One application of these methods is the recent study of nanohelices via MD simulations for mechanical energy harvesting purposes.
Physics, Issue 93, Helical atomistic models; open-source coding; graphical user interface; visualization software; molecular dynamics simulations; graphical processing unit accelerated simulations.
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g.
, signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1
. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2
, because the composition and spatial configuration of head tissues changes dramatically over development3
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
Determination of Protein-ligand Interactions Using Differential Scanning Fluorimetry
Institutions: University of Exeter.
A wide range of methods are currently available for determining the dissociation constant between a protein and interacting small molecules. However, most of these require access to specialist equipment, and often require a degree of expertise to effectively establish reliable experiments and analyze data. Differential scanning fluorimetry (DSF) is being increasingly used as a robust method for initial screening of proteins for interacting small molecules, either for identifying physiological partners or for hit discovery. This technique has the advantage that it requires only a PCR machine suitable for quantitative PCR, and so suitable instrumentation is available in most institutions; an excellent range of protocols are already available; and there are strong precedents in the literature for multiple uses of the method. Past work has proposed several means of calculating dissociation constants from DSF data, but these are mathematically demanding. Here, we demonstrate a method for estimating dissociation constants from a moderate amount of DSF experimental data. These data can typically be collected and analyzed within a single day. We demonstrate how different models can be used to fit data collected from simple binding events, and where cooperative binding or independent binding sites are present. Finally, we present an example of data analysis in a case where standard models do not apply. These methods are illustrated with data collected on commercially available control proteins, and two proteins from our research program. Overall, our method provides a straightforward way for researchers to rapidly gain further insight into protein-ligand interactions using DSF.
Biophysics, Issue 91, differential scanning fluorimetry, dissociation constant, protein-ligand interactions, StepOne, cooperativity, WcbI.
Automated, Quantitative Cognitive/Behavioral Screening of Mice: For Genetics, Pharmacology, Animal Cognition and Undergraduate Instruction
Institutions: Rutgers University, Koç University, New York University, Fairfield University.
We describe a high-throughput, high-volume, fully automated, live-in 24/7 behavioral testing system for assessing the effects of genetic and pharmacological manipulations on basic mechanisms of cognition and learning in mice. A standard polypropylene mouse housing tub is connected through an acrylic tube to a standard commercial mouse test box. The test box has 3 hoppers, 2 of which are connected to pellet feeders. All are internally illuminable with an LED and monitored for head entries by infrared (IR) beams. Mice live in the environment, which eliminates handling during screening. They obtain their food during two or more daily feeding periods by performing in operant (instrumental) and Pavlovian (classical) protocols, for which we have written protocol-control software and quasi-real-time data analysis and graphing software. The data analysis and graphing routines are written in a MATLAB-based language created to simplify greatly the analysis of large time-stamped behavioral and physiological event records and to preserve a full data trail from raw data through all intermediate analyses to the published graphs and statistics within a single data structure. The data-analysis code harvests the data several times a day and subjects it to statistical and graphical analyses, which are automatically stored in the "cloud" and on in-lab computers. Thus, the progress of individual mice is visualized and quantified daily. The data-analysis code talks to the protocol-control code, permitting the automated advance from protocol to protocol of individual subjects. The behavioral protocols implemented are matching, autoshaping, timed hopper-switching, risk assessment in timed hopper-switching, impulsivity measurement, and the circadian anticipation of food availability. Open-source protocol-control and data-analysis code makes the addition of new protocols simple. Eight test environments fit in a 48 in x 24 in x 78 in cabinet; two such cabinets (16 environments) may be controlled by one computer.
Behavior, Issue 84, genetics, cognitive mechanisms, behavioral screening, learning, memory, timing
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article would be that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also serve as an educational resource that could be incorporated into a classroom or teaching-lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
High-throughput Fluorometric Measurement of Potential Soil Extracellular Enzyme Activities
Institutions: Colorado State University, Oak Ridge National Laboratory, University of Colorado.
Microbes in soils and other environments produce extracellular enzymes to depolymerize and hydrolyze organic macromolecules so that they can be assimilated for energy and nutrients. Measuring soil microbial enzyme activity is crucial in understanding soil ecosystem functional dynamics. The general concept of the fluorescence enzyme assay is that synthetic C-, N-, or P-rich substrates bound with a fluorescent dye are added to soil samples. When intact, the labeled substrates do not fluoresce. Enzyme activity is measured as the increase in fluorescence as the fluorescent dyes are cleaved from their substrates, which allows them to fluoresce. Enzyme measurements can be expressed in units of molarity or activity. To perform this assay, soil slurries are prepared by combining soil with a pH buffer. The pH buffer (typically a 50 mM sodium acetate or 50 mM Tris buffer), is chosen for the buffer's particular acid dissociation constant (pKa) to best match the soil sample pH. The soil slurries are inoculated with a nonlimiting amount of fluorescently labeled (i.e.
C-, N-, or P-rich) substrate. Using soil slurries in the assay serves to minimize limitations on enzyme and substrate diffusion. Therefore, this assay controls for differences in substrate limitation, diffusion rates, and soil pH conditions; thus detecting potential enzyme activity rates as a function of the difference in enzyme concentrations (per sample).
Fluorescence enzyme assays are typically more sensitive than spectrophotometric (i.e.
colorimetric) assays, but can suffer from interference caused by impurities and the instability of many fluorescent compounds when exposed to light; so caution is required when handling fluorescent substrates. Likewise, this method only assesses potential enzyme activities under laboratory conditions when substrates are not limiting. Caution should be used when interpreting the data representing cross-site comparisons with differing temperatures or soil types, as in situ
soil type and temperature can influence enzyme kinetics.
Environmental Sciences, Issue 81, Ecological and Environmental Phenomena, Environment, Biochemistry, Environmental Microbiology, Soil Microbiology, Ecology, Eukaryota, Archaea, Bacteria, Soil extracellular enzyme activities (EEAs), fluorometric enzyme assays, substrate degradation, 4-methylumbelliferone (MUB), 7-amino-4-methylcoumarin (MUC), enzyme temperature kinetics, soil
Diffusion Tensor Magnetic Resonance Imaging in the Analysis of Neurodegenerative Diseases
Institutions: University of Ulm.
Diffusion tensor imaging (DTI) techniques provide information on the microstructural processes of the cerebral white matter (WM) in vivo
. The present applications are designed to investigate differences of WM involvement patterns in different brain diseases, especially neurodegenerative disorders, by use of different DTI analyses in comparison with matched controls.
DTI data analysis is performed in a variate fashion, i.e.
voxelwise comparison of regional diffusion direction-based metrics such as fractional anisotropy (FA), together with fiber tracking (FT) accompanied by tractwise fractional anisotropy statistics (TFAS) at the group level in order to identify differences in FA along WM structures, aiming at the definition of regional patterns of WM alterations at the group level. Transformation into a stereotaxic standard space is a prerequisite for group studies and requires thorough data processing to preserve directional inter-dependencies. The present applications show optimized technical approaches for this preservation of quantitative and directional information during spatial normalization in data analyses at the group level. On this basis, FT techniques can be applied to group averaged data in order to quantify metrics information as defined by FT. Additionally, application of DTI methods, i.e.
differences in FA-maps after stereotaxic alignment, in a longitudinal analysis at an individual subject basis reveal information about the progression of neurological disorders. Further quality improvement of DTI based results can be obtained during preprocessing by application of a controlled elimination of gradient directions with high noise levels.
In summary, DTI is used to define a distinct WM pathoanatomy of different brain diseases by the combination of whole brain-based and tract-based DTI analysis.
Medicine, Issue 77, Neuroscience, Neurobiology, Molecular Biology, Biomedical Engineering, Anatomy, Physiology, Neurodegenerative Diseases, nuclear magnetic resonance, NMR, MR, MRI, diffusion tensor imaging, fiber tracking, group level comparison, neurodegenerative diseases, brain, imaging, clinical techniques
Perceptual and Category Processing of the Uncanny Valley Hypothesis' Dimension of Human Likeness: Some Methodological Issues
Institutions: University of Zurich.
Mori's Uncanny Valley Hypothesis1,2
proposes that the perception of humanlike characters such as robots and, by extension, avatars (computer-generated characters) can evoke negative or positive affect (valence) depending on the object's degree of visual and behavioral realism along a dimension of human likeness
) (Figure 1
). But studies of affective valence of subjective responses to variously realistic non-human characters have produced inconsistent findings 3, 4, 5, 6
. One of a number of reasons for this is that human likeness is not perceived as the hypothesis assumes. While the DHL can be defined following Mori's description as a smooth linear change in the degree of physical humanlike similarity, subjective perception of objects along the DHL can be understood in terms of the psychological effects of categorical perception (CP) 7
. Further behavioral and neuroimaging investigations of category processing and CP along the DHL and of the potential influence of the dimension's underlying category structure on affective experience are needed. This protocol therefore focuses on the DHL and allows examination of CP. Based on the protocol presented in the video as an example, issues surrounding the methodology in the protocol and the use in "uncanny" research of stimuli drawn from morph continua to represent the DHL are discussed in the article that accompanies the video. The use of neuroimaging and morph stimuli to represent the DHL in order to disentangle brain regions neurally responsive to physical human-like similarity from those responsive to category change and category processing is briefly illustrated.
Behavior, Issue 76, Neuroscience, Neurobiology, Molecular Biology, Psychology, Neuropsychology, uncanny valley, functional magnetic resonance imaging, fMRI, categorical perception, virtual reality, avatar, human likeness, Mori, uncanny valley hypothesis, perception, magnetic resonance imaging, MRI, imaging, clinical techniques
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1
and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8
The ITS2 Database9
presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11
. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12
(direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13
. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14
search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16
for multiple sequence-structure alignment calculation and Neighbor Joining18
tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Synthetic, Multi-Layer, Self-Oscillating Vocal Fold Model Fabrication
Institutions: Brigham Young University.
Sound for the human voice is produced via flow-induced vocal fold vibration. The vocal folds consist of several layers of tissue, each with differing material properties 1
. Normal voice production relies on healthy tissue and vocal folds, and occurs as a result of complex coupling between aerodynamic, structural dynamic, and acoustic physical phenomena. Voice disorders affect up to 7.5 million annually in the United States alone 2
and often result in significant financial, social, and other quality-of-life difficulties. Understanding the physics of voice production has the potential to significantly benefit voice care, including clinical prevention, diagnosis, and treatment of voice disorders.
Existing methods for studying voice production include in vivo
experimentation using human and animal subjects, in vitro
experimentation using excised larynges and synthetic models, and computational modeling. Owing to hazardous and difficult instrument access, in vivo
experiments are severely limited in scope. Excised larynx experiments have the benefit of anatomical and some physiological realism, but parametric studies involving geometric and material property variables are limited. Further, they are typically only able to be vibrated for relatively short periods of time (typically on the order of minutes).
Overcoming some of the limitations of excised larynx experiments, synthetic vocal fold models are emerging as a complementary tool for studying voice production. Synthetic models can be fabricated with systematic changes to geometry and material properties, allowing for the study of healthy and unhealthy human phonatory aerodynamics, structural dynamics, and acoustics. For example, they have been used to study left-right vocal fold asymmetry 3,4
, clinical instrument development 5
, laryngeal aerodynamics 6-9
, vocal fold contact pressure 10
, and subglottal acoustics 11
(a more comprehensive list can be found in Kniesburges et al. 12
Existing synthetic vocal fold models, however, have either been homogenous (one-layer models) or have been fabricated using two materials of differing stiffness (two-layer models). This approach does not allow for representation of the actual multi-layer structure of the human vocal folds 1
that plays a central role in governing vocal fold flow-induced vibratory response. Consequently, one- and two-layer synthetic vocal fold models have exhibited disadvantages 3,6,8
such as higher onset pressures than what are typical for human phonation (onset pressure is the minimum lung pressure required to initiate vibration), unnaturally large inferior-superior motion, and lack of a "mucosal wave" (a vertically-traveling wave that is characteristic of healthy human vocal fold vibration).
In this paper, fabrication of a model with multiple layers of differing material properties is described. The model layers simulate the multi-layer structure of the human vocal folds, including epithelium, superficial lamina propria (SLP), intermediate and deep lamina propria (i.e., ligament; a fiber is included for anterior-posterior stiffness), and muscle (i.e., body) layers 1
. Results are included that show that the model exhibits improved vibratory characteristics over prior one- and two-layer synthetic models, including onset pressure closer to human onset pressure, reduced inferior-superior motion, and evidence of a mucosal wave.
Bioengineering, Issue 58, Vocal folds, larynx, voice, speech, artificial biomechanical models
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI
program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al.
2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
Analyzing and Building Nucleic Acid Structures with 3DNA
Institutions: Rutgers - The State University of New Jersey, Columbia University .
The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the set of instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamic (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at http://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Genetics, Issue 74, Molecular Biology, Biochemistry, Bioengineering, Biophysics, Genomics, Chemical Biology, Quantitative Biology, conformational analysis, DNA, high-resolution structures, model building, molecular dynamics, nucleic acid structure, RNA, visualization, bioinformatics, three-dimensional, 3DNA, software
Automated Midline Shift and Intracranial Pressure Estimation based on Brain CT Images
Institutions: Virginia Commonwealth University, Virginia Commonwealth University Reanimation Engineering Science (VCURES) Center, Virginia Commonwealth University, Virginia Commonwealth University, Virginia Commonwealth University.
In this paper we present an automated system based mainly on the computed tomography (CT) images consisting of two main components: the midline shift estimation and intracranial pressure (ICP) pre-screening system. To estimate the midline shift, first an estimation of the ideal midline is performed based on the symmetry of the skull and anatomical features in the brain CT scan. Then, segmentation of the ventricles from the CT scan is performed and used as a guide for the identification of the actual midline through shape matching. These processes mimic the measuring process by physicians and have shown promising results in the evaluation. In the second component, more features are extracted related to ICP, such as the texture information, blood amount from CT scans and other recorded features, such as age, injury severity score to estimate the ICP are also incorporated. Machine learning techniques including feature selection and classification, such as Support Vector Machines (SVMs), are employed to build the prediction model using RapidMiner. The evaluation of the prediction shows potential usefulness of the model. The estimated ideal midline shift and predicted ICP levels may be used as a fast pre-screening step for physicians to make decisions, so as to recommend for or against invasive ICP monitoring.
Medicine, Issue 74, Biomedical Engineering, Molecular Biology, Neurobiology, Biophysics, Physiology, Anatomy, Brain CT Image Processing, CT, Midline Shift, Intracranial Pressure Pre-screening, Gaussian Mixture Model, Shape Matching, Machine Learning, traumatic brain injury, TBI, imaging, clinical techniques
A Simple Composite Phenotype Scoring System for Evaluating Mouse Models of Cerebellar Ataxia
Institutions: University of Washington, University of Washington, University of California, San Diego - Rady Children’s Hospital.
We describe a protocol for the rapid and sensitive quantification of disease severity in mouse models of cerebella ataxia. It is derived from previously published phenotype assessments in several disease models, including spinocerebellar ataxias, Huntington s disease and spinobulbar muscular atrophy. Measures include hind limb clasping, ledge test, gait and kyphosis. Each measure is recorded on a scale of 0-3, with a combined total of 0-12 for all four measures. The results effectively discriminate between affected and non-affected individuals, while also quantifying the temporal progression of neurodegenerative disease phenotypes. Measures may be analyzed individually or combined into a composite phenotype score for greater statistical power. The ideal combination of the four described measures will depend upon the disorder in question. We present an example of the protocol used to assess disease severity in a transgenic mouse model of spinocerebellar ataxia type 7 (SCA7).
Albert R. La Spada and Gwenn A. Garden contributed to this manuscript equally.
JoVE Neuroscience, Issue 39, Neurodegeneration, Mouse behavior assay, cerebellar ataxia, polyglutamine disease