Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
22 Related JoVE Articles!
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Assessment of Immunologically Relevant Dynamic Tertiary Structural Features of the HIV-1 V3 Loop Crown R2 Sequence by ab initio Folding
Institutions: School of Medicine, New York University.
The antigenic diversity of HIV-1 has long been an obstacle to vaccine design, and this variability is especially pronounced in the V3 loop of the virus' surface envelope glycoprotein. We previously proposed that the crown of the V3 loop, although dynamic and sequence variable, is constrained throughout the population of HIV-1 viruses to an immunologically relevant β-hairpin tertiary structure. Importantly, there are thousands of different V3 loop crown sequences in circulating HIV-1 viruses, making 3D structural characterization of trends across the diversity of viruses difficult or impossible by crystallography or NMR. Our previous successful studies with folding of the V3 crown1, 2
used the ab initio
accessible in the ICM-Pro molecular modeling software package (Molsoft LLC, La Jolla, CA) and suggested that the crown of the V3 loop, specifically from positions 10 to 22, benefits sufficiently from the flexibility and length of its flanking stems to behave to a large degree as if it were an unconstrained peptide freely folding in solution. As such, rapid ab initio
folding of just this portion of the V3 loop of any individual strain of the 60,000+ circulating HIV-1 strains can be informative. Here, we folded the V3 loop of the R2 strain to gain insight into the structural basis of its unique properties. R2 bears a rare V3 loop sequence thought to be responsible for the exquisite sensitivity of this strain to neutralization by patient sera and monoclonal antibodies4, 5
. The strain mediates CD4-independent infection and appears to elicit broadly neutralizing antibodies. We demonstrate how evaluation of the results of the folding can be informative for associating observed structures in the folding with the immunological activities observed for R2.
Infection, Issue 43, HIV-1, structure-activity relationships, ab initio simulations, antibody-mediated neutralization, vaccine design
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1
). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information on populations generated by both forward and reverse genetics approaches in spite of none of these are effortless. By corollary this approach can also be used as an approach to characterise unknown peaks representing new or specific secondary metabolites in the limited tissues, plant species or stress treatment, which is currently the important trial to understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Developing Neuroimaging Phenotypes of the Default Mode Network in PTSD: Integrating the Resting State, Working Memory, and Structural Connectivity
Institutions: Alpert Medical School, Brown University, University of Georgia.
Complementary structural and functional neuroimaging techniques used to examine the Default Mode Network (DMN) could potentially improve assessments of psychiatric illness severity and provide added validity to the clinical diagnostic process. Recent neuroimaging research suggests that DMN processes may be disrupted in a number of stress-related psychiatric illnesses, such as posttraumatic stress disorder (PTSD).
Although specific DMN functions remain under investigation, it is generally thought to be involved in introspection and self-processing. In healthy individuals it exhibits greatest activity during periods of rest, with less activity, observed as deactivation, during cognitive tasks, e.g.
, working memory. This network consists of the medial prefrontal cortex, posterior cingulate cortex/precuneus, lateral parietal cortices and medial temporal regions.
Multiple functional and structural imaging approaches have been developed to study the DMN. These have unprecedented potential to further the understanding of the function and dysfunction of this network. Functional approaches, such as the evaluation of resting state connectivity and task-induced deactivation, have excellent potential to identify targeted neurocognitive and neuroaffective (functional) diagnostic markers and may indicate illness severity and prognosis with increased accuracy or specificity. Structural approaches, such as evaluation of morphometry and connectivity, may provide unique markers of etiology and long-term outcomes. Combined, functional and structural methods provide strong multimodal, complementary and synergistic approaches to develop valid DMN-based imaging phenotypes in stress-related psychiatric conditions. This protocol aims to integrate these methods to investigate DMN structure and function in PTSD, relating findings to illness severity and relevant clinical factors.
Medicine, Issue 89, default mode network, neuroimaging, functional magnetic resonance imaging, diffusion tensor imaging, structural connectivity, functional connectivity, posttraumatic stress disorder
Production of Disulfide-stabilized Transmembrane Peptide Complexes for Structural Studies
Institutions: The Walter and Eliza Hall Institute of Medical Research, The University of Melbourne.
Physical interactions among the lipid-embedded alpha-helical domains of membrane proteins play a crucial role in folding and assembly of membrane protein complexes and in dynamic processes such as transmembrane (TM) signaling and regulation of cell-surface protein levels. Understanding the structural features driving the association of particular sequences requires sophisticated biophysical and biochemical analyses of TM peptide complexes. However, the extreme hydrophobicity of TM domains makes them very difficult to manipulate using standard peptide chemistry techniques, and production of suitable study material often proves prohibitively challenging. Identifying conditions under which peptides can adopt stable helical conformations and form complexes spontaneously
adds a further level of difficulty. Here we present a procedure for the production of homo- or hetero-dimeric TM peptide complexes from materials that are expressed in E. coli
, thus allowing incorporation of stable isotope labels for nuclear magnetic resonance (NMR) or non-natural amino acids for other applications relatively inexpensively. The key innovation in this method is that TM complexes are produced and purified as covalently associated
(disulfide-crosslinked) assemblies that can form stable, stoichiometric and homogeneous structures when reconstituted into detergent, lipid or other membrane-mimetic materials. We also present carefully optimized procedures for expression and purification that are equally applicable whether producing single TM domains or crosslinked complexes and provide advice for adapting these methods to new TM sequences.
Biochemistry, Issue 73, Structural Biology, Chemistry, Chemical Engineering, Biophysics, Genetics, Molecular Biology, Membrane Proteins, Proteins, Molecular Structure, transmembrane domain, peptide chemistry, membrane protein structure, immune receptors, reversed-phase HPLC, HPLC, peptides, lipids, protein, cloning, TFA Elution, CNBr Digestion, NMR, expression, cell culture
Designing Silk-silk Protein Alloy Materials for Biomedical Applications
Institutions: Rowan University, Rowan University, Cooper Medical School of Rowan University, Rowan University.
Fibrous proteins display different sequences and structures that have been used for various applications in biomedical fields such as biosensors, nanomedicine, tissue regeneration, and drug delivery. Designing materials based on the molecular-scale interactions between these proteins will help generate new multifunctional protein alloy biomaterials with tunable properties. Such alloy material systems also provide advantages in comparison to traditional synthetic polymers due to the materials biodegradability, biocompatibility, and tenability in the body. This article used the protein blends of wild tussah silk (Antheraea pernyi
) and domestic mulberry silk (Bombyx mori
) as an example to provide useful protocols regarding these topics, including how to predict protein-protein interactions by computational methods, how to produce protein alloy solutions, how to verify alloy systems by thermal analysis, and how to fabricate variable alloy materials including optical materials with diffraction gratings, electric materials with circuits coatings, and pharmaceutical materials for drug release and delivery. These methods can provide important information for designing the next generation multifunctional biomaterials based on different protein alloys.
Bioengineering, Issue 90, protein alloys, biomaterials, biomedical, silk blends, computational simulation, implantable electronic devices
The Cell-based L-Glutathione Protection Assays to Study Endocytosis and Recycling of Plasma Membrane Proteins
Institutions: Children's Hospital of Pittsburgh of UPMC, University of Pittsburgh School of Medicine.
Membrane trafficking involves transport of proteins from the plasma membrane to the cell interior (i.e.
endocytosis) followed by trafficking to lysosomes for degradation or to the plasma membrane for recycling. The cell based L-glutathione protection assays can be used to study endocytosis and recycling of protein receptors, channels, transporters, and adhesion molecules localized at the cell surface. The endocytic assay requires labeling of cell surface proteins with a cell membrane impermeable biotin containing a disulfide bond and the N-hydroxysuccinimide (NHS) ester at 4 ºC - a temperature at which membrane trafficking does not occur. Endocytosis of biotinylated plasma membrane proteins is induced by incubation at 37 ºC. Next, the temperature is decreased again to 4 ºC to stop endocytic trafficking and the disulfide bond in biotin covalently attached to proteins that have remained at the plasma membrane is reduced with L-glutathione. At this point, only proteins that were endocytosed remain protected from L-glutathione and thus remain biotinylated. After cell lysis, biotinylated proteins are isolated with streptavidin agarose, eluted from agarose, and the biotinylated protein of interest is detected by western blotting. During the recycling assay, after biotinylation cells are incubated at 37 °C to load endocytic vesicles with biotinylated proteins and the disulfide bond in biotin covalently attached to proteins remaining at the plasma membrane is reduced with L-glutathione at 4 ºC as in the endocytic assay. Next, cells are incubated again at 37 °C to allow biotinylated proteins from endocytic vesicles to recycle to the plasma membrane. Cells are then incubated at 4 ºC, and the disulfide bond in biotin attached to proteins that recycled to the plasma membranes is reduced with L-glutathione. The biotinylated proteins protected from L-glutathione are those that did not recycle to the plasma membrane.
Basic Protocol, Issue 82, Endocytosis, recycling, plasma membrane, cell surface, EZLink, Sulfo-NHS-SS-Biotin, L-Glutathione, GSH, thiol group, disulfide bond, epithelial cells, cell polarization
Using Informational Connectivity to Measure the Synchronous Emergence of fMRI Multi-voxel Information Across Time
Institutions: University of Pennsylvania.
It is now appreciated that condition-relevant information can be present within distributed patterns of functional magnetic resonance imaging (fMRI) brain activity, even for conditions with similar levels of univariate activation. Multi-voxel pattern (MVP) analysis has been used to decode this information with great success. FMRI investigators also often seek to understand how brain regions interact in interconnected networks, and use functional connectivity (FC) to identify regions that have correlated responses over time. Just as univariate analyses can be insensitive to information in MVPs, FC may not fully characterize the brain networks that process conditions with characteristic MVP signatures. The method described here, informational connectivity (IC), can identify regions with correlated changes in MVP-discriminability across time, revealing connectivity that is not accessible to FC. The method can be exploratory, using searchlights to identify seed-connected areas, or planned, between pre-selected regions-of-interest. The results can elucidate networks of regions that process MVP-related conditions, can breakdown MVPA searchlight maps into separate networks, or can be compared across tasks and patient groups.
Neuroscience, Issue 89, fMRI, MVPA, connectivity, informational connectivity, functional connectivity, networks, multi-voxel pattern analysis, decoding, classification, method, multivariate
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1
and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8
The ITS2 Database9
presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11
. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12
(direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13
. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14
search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16
for multiple sequence-structure alignment calculation and Neighbor Joining18
tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Identification of Disease-related Spatial Covariance Patterns using Neuroimaging Data
Institutions: The Feinstein Institute for Medical Research.
The scaled subprofile model (SSM)1-4
is a multivariate PCA-based algorithm that identifies major sources of variation in patient and control group brain image data while rejecting lesser components (Figure 1
). Applied directly to voxel-by-voxel covariance data of steady-state multimodality images, an entire group image set can be reduced to a few significant linearly independent covariance patterns and corresponding subject scores. Each pattern, termed a group invariant subprofile (GIS), is an orthogonal principal component that represents a spatially distributed network of functionally interrelated brain regions. Large global mean scalar effects that can obscure smaller network-specific contributions are removed by the inherent logarithmic conversion and mean centering of the data2,5,6
. Subjects express each of these patterns to a variable degree represented by a simple scalar score that can correlate with independent clinical or psychometric descriptors7,8
. Using logistic regression analysis of subject scores (i.e.
pattern expression values), linear coefficients can be derived to combine multiple principal components into single disease-related spatial covariance patterns, i.e.
composite networks with improved discrimination of patients from healthy control subjects5,6
. Cross-validation within the derivation set can be performed using bootstrap resampling techniques9
. Forward validation is easily confirmed by direct score evaluation of the derived patterns in prospective datasets10
. Once validated, disease-related patterns can be used to score individual patients with respect to a fixed reference sample, often the set of healthy subjects that was used (with the disease group) in the original pattern derivation11
. These standardized values can in turn be used to assist in differential diagnosis12,13
and to assess disease progression and treatment effects at the network level7,14-16
. We present an example of the application of this methodology to FDG PET data of Parkinson's Disease patients and normal controls using our in-house software to derive a characteristic covariance pattern biomarker of disease.
Medicine, Issue 76, Neurobiology, Neuroscience, Anatomy, Physiology, Molecular Biology, Basal Ganglia Diseases, Parkinsonian Disorders, Parkinson Disease, Movement Disorders, Neurodegenerative Diseases, PCA, SSM, PET, imaging biomarkers, functional brain imaging, multivariate spatial covariance analysis, global normalization, differential diagnosis, PD, brain, imaging, clinical techniques
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1
. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2
, because the composition and spatial configuration of head tissues changes dramatically over development3
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article would be that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also serve as an educational resource that could be incorporated into a classroom or teaching-lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
Perceptual and Category Processing of the Uncanny Valley Hypothesis' Dimension of Human Likeness: Some Methodological Issues
Institutions: University of Zurich.
Mori's Uncanny Valley Hypothesis1,2
proposes that the perception of humanlike characters such as robots and, by extension, avatars (computer-generated characters) can evoke negative or positive affect (valence) depending on the object's degree of visual and behavioral realism along a dimension of human likeness
) (Figure 1
). But studies of affective valence of subjective responses to variously realistic non-human characters have produced inconsistent findings 3, 4, 5, 6
. One of a number of reasons for this is that human likeness is not perceived as the hypothesis assumes. While the DHL can be defined following Mori's description as a smooth linear change in the degree of physical humanlike similarity, subjective perception of objects along the DHL can be understood in terms of the psychological effects of categorical perception (CP) 7
. Further behavioral and neuroimaging investigations of category processing and CP along the DHL and of the potential influence of the dimension's underlying category structure on affective experience are needed. This protocol therefore focuses on the DHL and allows examination of CP. Based on the protocol presented in the video as an example, issues surrounding the methodology in the protocol and the use in "uncanny" research of stimuli drawn from morph continua to represent the DHL are discussed in the article that accompanies the video. The use of neuroimaging and morph stimuli to represent the DHL in order to disentangle brain regions neurally responsive to physical human-like similarity from those responsive to category change and category processing is briefly illustrated.
Behavior, Issue 76, Neuroscience, Neurobiology, Molecular Biology, Psychology, Neuropsychology, uncanny valley, functional magnetic resonance imaging, fMRI, categorical perception, virtual reality, avatar, human likeness, Mori, uncanny valley hypothesis, perception, magnetic resonance imaging, MRI, imaging, clinical techniques
High Throughput Quantitative Expression Screening and Purification Applied to Recombinant Disulfide-rich Venom Proteins Produced in E. coli
Institutions: Aix-Marseille Université, Commissariat à l'énergie atomique et aux énergies alternatives (CEA) Saclay, France.
Escherichia coli (E. coli)
is the most widely used expression system for the production of recombinant proteins for structural and functional studies. However, purifying proteins is sometimes challenging since many proteins are expressed in an insoluble form. When working with difficult or multiple targets it is therefore recommended to use high throughput (HTP) protein expression screening on a small scale (1-4 ml cultures) to quickly identify conditions for soluble expression. To cope with the various structural genomics programs of the lab, a quantitative (within a range of 0.1-100 mg/L culture of recombinant protein) and HTP protein expression screening protocol was implemented and validated on thousands of proteins. The protocols were automated with the use of a liquid handling robot but can also be performed manually without specialized equipment.
Disulfide-rich venom proteins are gaining increasing recognition for their potential as therapeutic drug leads. They can be highly potent and selective, but their complex disulfide bond networks make them challenging to produce. As a member of the FP7 European Venomics project (www.venomics.eu), our challenge is to develop successful production strategies with the aim of producing thousands of novel venom proteins for functional characterization. Aided by the redox properties of disulfide bond isomerase DsbC, we adapted our HTP production pipeline for the expression of oxidized, functional venom peptides in the E. coli
cytoplasm. The protocols are also applicable to the production of diverse disulfide-rich proteins. Here we demonstrate our pipeline applied to the production of animal venom proteins. With the protocols described herein it is likely that soluble disulfide-rich proteins will be obtained in as little as a week. Even from a small scale, there is the potential to use the purified proteins for validating the oxidation state by mass spectrometry, for characterization in pilot studies, or for sensitive micro-assays.
Bioengineering, Issue 89, E. coli, expression, recombinant, high throughput (HTP), purification, auto-induction, immobilized metal affinity chromatography (IMAC), tobacco etch virus protease (TEV) cleavage, disulfide bond isomerase C (DsbC) fusion, disulfide bonds, animal venom proteins/peptides
Test Samples for Optimizing STORM Super-Resolution Microscopy
Institutions: National Physical Laboratory.
STORM is a recently developed super-resolution microscopy technique with up to 10 times better resolution than standard fluorescence microscopy techniques. However, as the image is acquired in a very different way than normal, by building up an image molecule-by-molecule, there are some significant challenges for users in trying to optimize their image acquisition. In order to aid this process and gain more insight into how STORM works we present the preparation of 3 test samples and the methodology of acquiring and processing STORM super-resolution images with typical resolutions of between 30-50 nm. By combining the test samples with the use of the freely available rainSTORM processing software it is possible to obtain a great deal of information about image quality and resolution. Using these metrics it is then possible to optimize the imaging procedure from the optics, to sample preparation, dye choice, buffer conditions, and image acquisition settings. We also show examples of some common problems that result in poor image quality, such as lateral drift, where the sample moves during image acquisition and density related problems resulting in the 'mislocalization' phenomenon.
Molecular Biology, Issue 79, Genetics, Bioengineering, Biomedical Engineering, Biophysics, Basic Protocols, HeLa Cells, Actin Cytoskeleton, Coated Vesicles, Receptor, Epidermal Growth Factor, Actins, Fluorescence, Endocytosis, Microscopy, STORM, super-resolution microscopy, nanoscopy, cell biology, fluorescence microscopy, test samples, resolution, actin filaments, fiducial markers, epidermal growth factor, cell, imaging
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Detection of Architectural Distortion in Prior Mammograms via Analysis of Oriented Patterns
Institutions: University of Calgary , University of Calgary .
We demonstrate methods for the detection of architectural distortion in prior mammograms of interval-cancer cases based on analysis of the orientation of breast tissue patterns in mammograms. We hypothesize that architectural distortion modifies the normal orientation of breast tissue patterns in mammographic images before the formation of masses or tumors. In the initial steps of our methods, the oriented structures in a given mammogram are analyzed using Gabor filters and phase portraits to detect node-like sites of radiating or intersecting tissue patterns. Each detected site is then characterized using the node value, fractal dimension, and a measure of angular dispersion specifically designed to represent spiculating patterns associated with architectural distortion.
Our methods were tested with a database of 106 prior mammograms of 56 interval-cancer cases and 52 mammograms of 13 normal cases using the features developed for the characterization of architectural distortion, pattern classification via
quadratic discriminant analysis, and validation with the leave-one-patient out procedure. According to the results of free-response receiver operating characteristic analysis, our methods have demonstrated the capability to detect architectural distortion in prior mammograms, taken 15 months (on the average) before clinical diagnosis of breast cancer, with a sensitivity of 80% at about five false positives per patient.
Medicine, Issue 78, Anatomy, Physiology, Cancer Biology, angular spread, architectural distortion, breast cancer, Computer-Assisted Diagnosis, computer-aided diagnosis (CAD), entropy, fractional Brownian motion, fractal dimension, Gabor filters, Image Processing, Medical Informatics, node map, oriented texture, Pattern Recognition, phase portraits, prior mammograms, spectral analysis
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g.
, signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Analyzing and Building Nucleic Acid Structures with 3DNA
Institutions: Rutgers - The State University of New Jersey, Columbia University .
The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the set of instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamic (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at http://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Genetics, Issue 74, Molecular Biology, Biochemistry, Bioengineering, Biophysics, Genomics, Chemical Biology, Quantitative Biology, conformational analysis, DNA, high-resolution structures, model building, molecular dynamics, nucleic acid structure, RNA, visualization, bioinformatics, three-dimensional, 3DNA, software
Cross-Modal Multivariate Pattern Analysis
Institutions: University of Southern California.
Multivariate pattern analysis (MVPA) is an increasingly popular method of analyzing functional magnetic resonance imaging (fMRI) data1-4
. Typically, the method is used to identify a subject's perceptual experience from neural activity in certain regions of the brain. For instance, it has been employed to predict the orientation of visual gratings a subject perceives from activity in early visual cortices5
or, analogously, the content of speech from activity in early auditory cortices6
Here, we present an extension of the classical MVPA paradigm, according to which perceptual stimuli are not predicted within, but across sensory systems. Specifically, the method we describe addresses the question of whether stimuli that evoke memory associations in modalities other than the one through which they are presented induce content-specific activity patterns in the sensory cortices of those other modalities. For instance, seeing a muted video clip of a glass vase shattering on the ground automatically triggers in most observers an auditory image of the associated sound; is the experience of this image in the "mind's ear" correlated with a specific neural activity pattern in early auditory cortices? Furthermore, is this activity pattern distinct from the pattern that could be observed if the subject were, instead, watching a video clip of a howling dog?
In two previous studies7,8
, we were able to predict sound- and touch-implying video clips based on neural activity in early auditory and somatosensory cortices, respectively. Our results are in line with a neuroarchitectural framework proposed by Damasio9,10
, according to which the experience of mental images that are based on memories - such as hearing the shattering sound of a vase in the "mind's ear" upon seeing the corresponding video clip - is supported by the re-construction of content-specific neural activity patterns in early sensory cortices.
Neuroscience, Issue 57, perception, sensory, cross-modal, top-down, mental imagery, fMRI, MRI, neuroimaging, multivariate pattern analysis, MVPA
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Automated Midline Shift and Intracranial Pressure Estimation based on Brain CT Images
Institutions: Virginia Commonwealth University, Virginia Commonwealth University Reanimation Engineering Science (VCURES) Center, Virginia Commonwealth University, Virginia Commonwealth University, Virginia Commonwealth University.
In this paper we present an automated system based mainly on the computed tomography (CT) images consisting of two main components: the midline shift estimation and intracranial pressure (ICP) pre-screening system. To estimate the midline shift, first an estimation of the ideal midline is performed based on the symmetry of the skull and anatomical features in the brain CT scan. Then, segmentation of the ventricles from the CT scan is performed and used as a guide for the identification of the actual midline through shape matching. These processes mimic the measuring process by physicians and have shown promising results in the evaluation. In the second component, more features are extracted related to ICP, such as the texture information, blood amount from CT scans and other recorded features, such as age, injury severity score to estimate the ICP are also incorporated. Machine learning techniques including feature selection and classification, such as Support Vector Machines (SVMs), are employed to build the prediction model using RapidMiner. The evaluation of the prediction shows potential usefulness of the model. The estimated ideal midline shift and predicted ICP levels may be used as a fast pre-screening step for physicians to make decisions, so as to recommend for or against invasive ICP monitoring.
Medicine, Issue 74, Biomedical Engineering, Molecular Biology, Neurobiology, Biophysics, Physiology, Anatomy, Brain CT Image Processing, CT, Midline Shift, Intracranial Pressure Pre-screening, Gaussian Mixture Model, Shape Matching, Machine Learning, traumatic brain injury, TBI, imaging, clinical techniques