The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
24 Related JoVE Articles!
Real Time Measurements of Membrane Protein:Receptor Interactions Using Surface Plasmon Resonance (SPR)
Institutions: The Technion-Israel Institute of Technology.
Protein-protein interactions are pivotal to most, if not all, physiological processes, and understanding the nature of such interactions is a central step in biological research. Surface Plasmon Resonance (SPR) is a sensitive detection technique for label-free study of bio-molecular interactions in real time. In a typical SPR experiment, one component (usually a protein, termed 'ligand') is immobilized onto a sensor chip surface, while the other (the 'analyte') is free in solution and is injected over the surface. Association and dissociation of the analyte from the ligand are measured and plotted in real time on a graph called a sensogram, from which pre-equilibrium and equilibrium data is derived. Being label-free, consuming low amounts of material, and providing pre-equilibrium kinetic data, often makes SPR the method of choice when studying dynamics of protein interactions. However, one has to keep in mind that due to the method's high sensitivity, the data obtained needs to be carefully analyzed, and supported by other biochemical methods.
SPR is particularly suitable for studying membrane proteins since it consumes small amounts of purified material, and is compatible with lipids and detergents. This protocol describes an SPR experiment characterizing the kinetic properties of the interaction between a membrane protein (an ABC transporter) and a soluble protein (the transporter's cognate substrate binding protein).
Structural Biology, Issue 93, ABC transporter, substrate binding protein, bio-molecular interaction kinetics, label-free, protein-protein interaction, Surface plasmon resonance (SPR), Biacore
Optimized Negative Staining: a High-throughput Protocol for Examining Small and Asymmetric Protein Structure by Electron Microscopy
Institutions: The Molecular Foundry.
Structural determination of proteins is rather challenging for proteins with molecular masses between 40 - 200 kDa. Considering that more than half of natural proteins have a molecular mass between 40 - 200 kDa1,2
, a robust and high-throughput method with a nanometer resolution capability is needed. Negative staining (NS) electron microscopy (EM) is an easy, rapid, and qualitative approach which has frequently been used in research laboratories to examine protein structure and protein-protein interactions. Unfortunately, conventional NS protocols often generate structural artifacts on proteins, especially with lipoproteins that usually form presenting rouleaux artifacts. By using images of lipoproteins from cryo-electron microscopy (cryo-EM) as a standard, the key parameters in NS specimen preparation conditions were recently screened and reported as the optimized NS protocol (OpNS), a modified conventional NS protocol 3
. Artifacts like rouleaux can be greatly limited by OpNS, additionally providing high contrast along with reasonably high‐resolution (near 1 nm) images of small and asymmetric proteins. These high-resolution and high contrast images are even favorable for an individual protein (a single object, no average) 3D reconstruction, such as a 160 kDa antibody, through the method of electron tomography4,5
. Moreover, OpNS can be a high‐throughput tool to examine hundreds of samples of small proteins. For example, the previously published mechanism of 53 kDa cholesteryl ester transfer protein (CETP) involved the screening and imaging of hundreds of samples 6
. Considering cryo-EM rarely successfully images proteins less than 200 kDa has yet to publish any study involving screening over one hundred sample conditions, it is fair to call OpNS a high-throughput method for studying small proteins. Hopefully the OpNS protocol presented here can be a useful tool to push the boundaries of EM and accelerate EM studies into small protein structure, dynamics and mechanisms.
Environmental Sciences, Issue 90, small and asymmetric protein structure, electron microscopy, optimized negative staining
Synthesis of an Intein-mediated Artificial Protein Hydrogel
Institutions: Texas A&M University, College Station, Texas A&M University, College Station.
We present the synthesis of a highly stable protein hydrogel mediated by a split-intein-catalyzed protein trans
-splicing reaction. The building blocks of this hydrogel are two protein block-copolymers each containing a subunit of a trimeric protein that serves as a crosslinker and one half of a split intein. A highly hydrophilic random coil is inserted into one of the block-copolymers for water retention. Mixing of the two protein block copolymers triggers an intein trans
-splicing reaction, yielding a polypeptide unit with crosslinkers at either end that rapidly self-assembles into a hydrogel. This hydrogel is very stable under both acidic and basic conditions, at temperatures up to 50 °C, and in organic solvents. The hydrogel rapidly reforms after shear-induced rupture. Incorporation of a "docking station peptide" into the hydrogel building block enables convenient incorporation of "docking protein"-tagged target proteins. The hydrogel is compatible with tissue culture growth media, supports the diffusion of 20 kDa molecules, and enables the immobilization of bioactive globular proteins. The application of the intein-mediated protein hydrogel as an organic-solvent-compatible biocatalyst was demonstrated by encapsulating the horseradish peroxidase enzyme and corroborating its activity.
Bioengineering, Issue 83, split-intein, self-assembly, shear-thinning, enzyme, immobilization, organic synthesis
Visualization of ATP Synthase Dimers in Mitochondria by Electron Cryo-tomography
Institutions: Max Planck Institute of Biophysics.
Electron cryo-tomography is a powerful tool in structural biology, capable of visualizing the three-dimensional structure of biological samples, such as cells, organelles, membrane vesicles, or viruses at molecular detail. To achieve this, the aqueous sample is rapidly vitrified in liquid ethane, which preserves it in a close-to-native, frozen-hydrated state. In the electron microscope, tilt series are recorded at liquid nitrogen temperature, from which 3D tomograms are reconstructed. The signal-to-noise ratio of the tomographic volume is inherently low. Recognizable, recurring features are enhanced by subtomogram averaging, by which individual subvolumes are cut out, aligned and averaged to reduce noise. In this way, 3D maps with a resolution of 2 nm or better can be obtained. A fit of available high-resolution structures to the 3D volume then produces atomic models of protein complexes in their native environment. Here we show how we use electron cryo-tomography to study the in situ
organization of large membrane protein complexes in mitochondria. We find that ATP synthases are organized in rows of dimers along highly curved apices of the inner membrane cristae, whereas complex I is randomly distributed in the membrane regions on either side of the rows. By subtomogram averaging we obtained a structure of the mitochondrial ATP synthase dimer within the cristae membrane.
Structural Biology, Issue 91, electron microscopy, electron cryo-tomography, mitochondria, ultrastructure, membrane structure, membrane protein complexes, ATP synthase, energy conversion, bioenergetics
Protein-protein Interactions Visualized by Bimolecular Fluorescence Complementation in Tobacco Protoplasts and Leaves
Institutions: Ludwig-Maximilians-Universität, München.
Many proteins interact transiently with other proteins or are integrated into multi-protein complexes to perform their biological function. Bimolecular fluorescence complementation (BiFC) is an in vivo
method to monitor such interactions in plant cells. In the presented protocol the investigated candidate proteins are fused to complementary halves of fluorescent proteins and the respective constructs are introduced into plant cells via agrobacterium-mediated transformation. Subsequently, the proteins are transiently expressed in tobacco leaves and the restored fluorescent signals can be detected with a confocal laser scanning microscope in the intact cells. This allows not only visualization of the interaction itself, but also the subcellular localization of the protein complexes can be determined. For this purpose, marker genes containing a fluorescent tag can be coexpressed along with the BiFC constructs, thus visualizing cellular structures such as the endoplasmic reticulum, mitochondria, the Golgi apparatus or the plasma membrane. The fluorescent signal can be monitored either directly in epidermal leaf cells or in single protoplasts, which can be easily isolated from the transformed tobacco leaves. BiFC is ideally suited to study protein-protein interactions in their natural surroundings within the living cell. However, it has to be considered that the expression has to be driven by strong promoters and that the interaction partners are modified due to fusion of the relatively large fluorescence tags, which might interfere with the interaction mechanism. Nevertheless, BiFC is an excellent complementary approach to other commonly applied methods investigating protein-protein interactions, such as coimmunoprecipitation, in vitro
pull-down assays or yeast-two-hybrid experiments.
Plant Biology, Issue 85, Tetratricopeptide repeat domain, chaperone, chloroplasts, endoplasmic reticulum, HSP90, Toc complex, Sec translocon, BiFC
Scalable Nanohelices for Predictive Studies and Enhanced 3D Visualization
Institutions: University of California Merced, University of California Merced.
Spring-like materials are ubiquitous in nature and of interest in nanotechnology for energy harvesting, hydrogen storage, and biological sensing applications. For predictive simulations, it has become increasingly important to be able to model the structure of nanohelices accurately. To study the effect of local structure on the properties of these complex geometries one must develop realistic models. To date, software packages are rather limited in creating atomistic helical models. This work focuses on producing atomistic models of silica glass (SiO2
) nanoribbons and nanosprings for molecular dynamics (MD) simulations. Using an MD model of “bulk” silica glass, two computational procedures to precisely create the shape of nanoribbons and nanosprings are presented. The first method employs the AWK programming language and open-source software to effectively carve various shapes of silica nanoribbons from the initial bulk model, using desired dimensions and parametric equations to define a helix. With this method, accurate atomistic silica nanoribbons can be generated for a range of pitch values and dimensions. The second method involves a more robust code which allows flexibility in modeling nanohelical structures. This approach utilizes a C++ code particularly written to implement pre-screening methods as well as the mathematical equations for a helix, resulting in greater precision and efficiency when creating nanospring models. Using these codes, well-defined and scalable nanoribbons and nanosprings suited for atomistic simulations can be effectively created. An added value in both open-source codes is that they can be adapted to reproduce different helical structures, independent of material. In addition, a MATLAB graphical user interface (GUI) is used to enhance learning through visualization and interaction for a general user with the atomistic helical structures. One application of these methods is the recent study of nanohelices via MD simulations for mechanical energy harvesting purposes.
Physics, Issue 93, Helical atomistic models; open-source coding; graphical user interface; visualization software; molecular dynamics simulations; graphical processing unit accelerated simulations.
Super-resolution Imaging of Neuronal Dense-core Vesicles
Institutions: Lewis & Clark College, Lewis & Clark College, Oregon Health & Science University, Lewis & Clark College.
Detection of fluorescence provides the foundation for many widely utilized and rapidly advancing microscopy techniques employed in modern biological and medical applications. Strengths of fluorescence include its sensitivity, specificity, and compatibility with live imaging. Unfortunately, conventional forms of fluorescence microscopy suffer from one major weakness, diffraction-limited resolution in the imaging plane, which hampers studies of structures with dimensions smaller than ~250 nm. Recently, this limitation has been overcome with the introduction of super-resolution fluorescence microscopy techniques, such as photoactivated localization microscopy (PALM). Unlike its conventional counterparts, PALM can produce images with a lateral resolution of tens of nanometers. It is thus now possible to use fluorescence, with its myriad strengths, to elucidate a spectrum of previously inaccessible attributes of cellular structure and organization.
Unfortunately, PALM is not trivial to implement, and successful strategies often must be tailored to the type of system under study. In this article, we show how to implement single-color PALM studies of vesicular structures in fixed, cultured neurons. PALM is ideally suited to the study of vesicles, which have dimensions that typically range from ~50-250 nm. Key steps in our approach include labeling neurons with photoconvertible (green to red) chimeras of vesicle cargo, collecting sparsely sampled raw images with a super-resolution microscopy system, and processing the raw images to produce a high-resolution PALM image. We also demonstrate the efficacy of our approach by presenting exceptionally well-resolved images of dense-core vesicles (DCVs) in cultured hippocampal neurons, which refute the hypothesis that extrasynaptic trafficking of DCVs is mediated largely by DCV clusters.
Neuroscience, Issue 89, photoactivated localization microscopy, Dendra2, dense-core vesicle, synapse, hippocampal neuron, cluster
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g.
, signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1
. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2
, because the composition and spatial configuration of head tissues changes dramatically over development3
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
Determination of Protein-ligand Interactions Using Differential Scanning Fluorimetry
Institutions: University of Exeter.
A wide range of methods are currently available for determining the dissociation constant between a protein and interacting small molecules. However, most of these require access to specialist equipment, and often require a degree of expertise to effectively establish reliable experiments and analyze data. Differential scanning fluorimetry (DSF) is being increasingly used as a robust method for initial screening of proteins for interacting small molecules, either for identifying physiological partners or for hit discovery. This technique has the advantage that it requires only a PCR machine suitable for quantitative PCR, and so suitable instrumentation is available in most institutions; an excellent range of protocols are already available; and there are strong precedents in the literature for multiple uses of the method. Past work has proposed several means of calculating dissociation constants from DSF data, but these are mathematically demanding. Here, we demonstrate a method for estimating dissociation constants from a moderate amount of DSF experimental data. These data can typically be collected and analyzed within a single day. We demonstrate how different models can be used to fit data collected from simple binding events, and where cooperative binding or independent binding sites are present. Finally, we present an example of data analysis in a case where standard models do not apply. These methods are illustrated with data collected on commercially available control proteins, and two proteins from our research program. Overall, our method provides a straightforward way for researchers to rapidly gain further insight into protein-ligand interactions using DSF.
Biophysics, Issue 91, differential scanning fluorimetry, dissociation constant, protein-ligand interactions, StepOne, cooperativity, WcbI.
Metabolomic Analysis of Rat Brain by High Resolution Nuclear Magnetic Resonance Spectroscopy of Tissue Extracts
Institutions: Aix-Marseille Université, Aix-Marseille Université.
Studies of gene expression on the RNA and protein levels have long been used to explore biological processes underlying disease. More recently, genomics and proteomics have been complemented by comprehensive quantitative analysis of the metabolite pool present in biological systems. This strategy, termed metabolomics, strives to provide a global characterization of the small-molecule complement involved in metabolism. While the genome and the proteome define the tasks cells can perform, the metabolome is part of the actual phenotype. Among the methods currently used in metabolomics, spectroscopic techniques are of special interest because they allow one to simultaneously analyze a large number of metabolites without prior selection for specific biochemical pathways, thus enabling a broad unbiased approach. Here, an optimized experimental protocol for metabolomic analysis by high-resolution NMR spectroscopy is presented, which is the method of choice for efficient quantification of tissue metabolites. Important strengths of this method are (i) the use of crude extracts, without the need to purify the sample and/or separate metabolites; (ii) the intrinsically quantitative nature of NMR, permitting quantitation of all metabolites represented by an NMR spectrum with one reference compound only; and (iii) the nondestructive nature of NMR enabling repeated use of the same sample for multiple measurements. The dynamic range of metabolite concentrations that can be covered is considerable due to the linear response of NMR signals, although metabolites occurring at extremely low concentrations may be difficult to detect. For the least abundant compounds, the highly sensitive mass spectrometry method may be advantageous although this technique requires more intricate sample preparation and quantification procedures than NMR spectroscopy. We present here an NMR protocol adjusted to rat brain analysis; however, the same protocol can be applied to other tissues with minor modifications.
Neuroscience, Issue 91, metabolomics, brain tissue, rodents, neurochemistry, tissue extracts, NMR spectroscopy, quantitative metabolite analysis, cerebral metabolism, metabolic profile
Designing Silk-silk Protein Alloy Materials for Biomedical Applications
Institutions: Rowan University, Rowan University, Cooper Medical School of Rowan University, Rowan University.
Fibrous proteins display different sequences and structures that have been used for various applications in biomedical fields such as biosensors, nanomedicine, tissue regeneration, and drug delivery. Designing materials based on the molecular-scale interactions between these proteins will help generate new multifunctional protein alloy biomaterials with tunable properties. Such alloy material systems also provide advantages in comparison to traditional synthetic polymers due to the materials biodegradability, biocompatibility, and tenability in the body. This article used the protein blends of wild tussah silk (Antheraea pernyi
) and domestic mulberry silk (Bombyx mori
) as an example to provide useful protocols regarding these topics, including how to predict protein-protein interactions by computational methods, how to produce protein alloy solutions, how to verify alloy systems by thermal analysis, and how to fabricate variable alloy materials including optical materials with diffraction gratings, electric materials with circuits coatings, and pharmaceutical materials for drug release and delivery. These methods can provide important information for designing the next generation multifunctional biomaterials based on different protein alloys.
Bioengineering, Issue 90, protein alloys, biomaterials, biomedical, silk blends, computational simulation, implantable electronic devices
Simultaneous Multicolor Imaging of Biological Structures with Fluorescence Photoactivation Localization Microscopy
Institutions: University of Maine.
Localization-based super resolution microscopy can be applied to obtain a spatial map (image) of the distribution of individual fluorescently labeled single molecules within a sample with a spatial resolution of tens of nanometers. Using either photoactivatable (PAFP) or photoswitchable (PSFP) fluorescent proteins fused to proteins of interest, or organic dyes conjugated to antibodies or other molecules of interest, fluorescence photoactivation localization microscopy (FPALM) can simultaneously image multiple species of molecules within single cells. By using the following approach, populations of large numbers (thousands to hundreds of thousands) of individual molecules are imaged in single cells and localized with a precision of ~10-30 nm. Data obtained can be applied to understanding the nanoscale spatial distributions of multiple protein types within a cell. One primary advantage of this technique is the dramatic increase in spatial resolution: while diffraction limits resolution to ~200-250 nm in conventional light microscopy, FPALM can image length scales more than an order of magnitude smaller. As many biological hypotheses concern the spatial relationships among different biomolecules, the improved resolution of FPALM can provide insight into questions of cellular organization which have previously been inaccessible to conventional fluorescence microscopy. In addition to detailing the methods for sample preparation and data acquisition, we here describe the optical setup for FPALM. One additional consideration for researchers wishing to do super-resolution microscopy is cost: in-house setups are significantly cheaper than most commercially available imaging machines. Limitations of this technique include the need for optimizing the labeling of molecules of interest within cell samples, and the need for post-processing software to visualize results. We here describe the use of PAFP and PSFP expression to image two protein species in fixed cells. Extension of the technique to living cells is also described.
Basic Protocol, Issue 82, Microscopy, Super-resolution imaging, Multicolor, single molecule, FPALM, Localization microscopy, fluorescent proteins
Mechanical Stimulation-induced Calcium Wave Propagation in Cell Monolayers: The Example of Bovine Corneal Endothelial Cells
Institutions: KU Leuven.
Intercellular communication is essential for the coordination of physiological processes between cells in a variety of organs and tissues, including the brain, liver, retina, cochlea and vasculature. In experimental settings, intercellular Ca2+
-waves can be elicited by applying a mechanical stimulus to a single cell. This leads to the release of the intracellular signaling molecules IP3
that initiate the propagation of the Ca2+
-wave concentrically from the mechanically stimulated cell to the neighboring cells. The main molecular pathways that control intercellular Ca2+
-wave propagation are provided by gap junction channels through the direct transfer of IP3
and by hemichannels through the release of ATP. Identification and characterization of the properties and regulation of different connexin and pannexin isoforms as gap junction channels and hemichannels are allowed by the quantification of the spread of the intercellular Ca2+
-wave, siRNA, and the use of inhibitors of gap junction channels and hemichannels. Here, we describe a method to measure intercellular Ca2+
-wave in monolayers of primary corneal endothelial cells loaded with Fluo4-AM in response to a controlled and localized mechanical stimulus provoked by an acute, short-lasting deformation of the cell as a result of touching the cell membrane with a micromanipulator-controlled glass micropipette with a tip diameter of less than 1 μm. We also describe the isolation of primary bovine corneal endothelial cells and its use as model system to assess Cx43-hemichannel activity as the driven force for intercellular Ca2+
-waves through the release of ATP. Finally, we discuss the use, advantages, limitations and alternatives of this method in the context of gap junction channel and hemichannel research.
Cellular Biology, Issue 77, Molecular Biology, Medicine, Biomedical Engineering, Biophysics, Immunology, Ophthalmology, Gap Junctions, Connexins, Connexin 43, Calcium Signaling, Ca2+, Cell Communication, Paracrine Communication, Intercellular communication, calcium wave propagation, gap junctions, hemichannels, endothelial cells, cell signaling, cell, isolation, cell culture
Detection of Architectural Distortion in Prior Mammograms via Analysis of Oriented Patterns
Institutions: University of Calgary , University of Calgary .
We demonstrate methods for the detection of architectural distortion in prior mammograms of interval-cancer cases based on analysis of the orientation of breast tissue patterns in mammograms. We hypothesize that architectural distortion modifies the normal orientation of breast tissue patterns in mammographic images before the formation of masses or tumors. In the initial steps of our methods, the oriented structures in a given mammogram are analyzed using Gabor filters and phase portraits to detect node-like sites of radiating or intersecting tissue patterns. Each detected site is then characterized using the node value, fractal dimension, and a measure of angular dispersion specifically designed to represent spiculating patterns associated with architectural distortion.
Our methods were tested with a database of 106 prior mammograms of 56 interval-cancer cases and 52 mammograms of 13 normal cases using the features developed for the characterization of architectural distortion, pattern classification via
quadratic discriminant analysis, and validation with the leave-one-patient out procedure. According to the results of free-response receiver operating characteristic analysis, our methods have demonstrated the capability to detect architectural distortion in prior mammograms, taken 15 months (on the average) before clinical diagnosis of breast cancer, with a sensitivity of 80% at about five false positives per patient.
Medicine, Issue 78, Anatomy, Physiology, Cancer Biology, angular spread, architectural distortion, breast cancer, Computer-Assisted Diagnosis, computer-aided diagnosis (CAD), entropy, fractional Brownian motion, fractal dimension, Gabor filters, Image Processing, Medical Informatics, node map, oriented texture, Pattern Recognition, phase portraits, prior mammograms, spectral analysis
The Use of Reverse Phase Protein Arrays (RPPA) to Explore Protein Expression Variation within Individual Renal Cell Cancers
Institutions: University of Edinburgh, University of St Andrews, University of Edinburgh, University of Edinburgh, Western General Hospital, University of Edinburgh, Queen Mary University of London.
Currently there is no curative treatment for metastatic clear cell renal cell cancer, the commonest variant of the disease. A key factor in this treatment resistance is thought to be the molecular complexity of the disease 1
. Targeted therapy such as the tyrosine kinase inhibitor (TKI)-sunitinib have been utilized, but only 40% of patients will respond, with the overwhelming majority of these patients relapsing within 1 year 2
. As such the question of intrinsic and acquired resistance in renal cell cancer patients is highly relevant 3
In order to study resistance to TKIs, with the ultimate goal of developing effective, personalized treatments, sequential tissue after a specific period of targeted therapy is required, an approach which had proved successful in chronic myeloid leukaemia 4
. However the application of such a strategy in renal cell carcinoma is complicated by the high level of both inter- and intratumoral heterogeneity, which is a feature of renal cell carcinoma5,6
as well as other solid tumors 7
. Intertumoral heterogeneity due to transcriptomic and genetic differences is well established even in patients with similar presentation, stage and grade of tumor. In addition it is clear that there is great morphological (intratumoral) heterogeneity in RCC, which is likely to represent even greater molecular heterogeneity. Detailed mapping and categorization of RCC tumors by combined morphological analysis and Fuhrman grading allows the selection of representative areas for proteomic analysis.
Protein based analysis of RCC8
is attractive due to its widespread availability in pathology laboratories; however, its application can be problematic due to the limited availability of specific antibodies 9
. Due to the dot blot nature of the Reverse Phase Protein Arrays (RPPA), antibody specificity must be pre-validated; as such strict quality control of antibodies used is of paramount importance. Despite this limitation the dot blot format does allow assay miniaturization, allowing for the printing of hundreds of samples onto a single nitrocellulose slide. Printed slides can then be analyzed in a similar fashion to Western analysis with the use of target specific primary antibodies and fluorescently labelled secondary antibodies, allowing for multiplexing. Differential protein expression across all the samples on a slide can then be analyzed simultaneously by comparing the relative level of fluorescence in a more cost-effective and high-throughput manner.
Cancer Biology, Issue 71, Bioengineering, Medicine, Biomedical Engineering, Cellular Biology, Molecular Biology, Genetics, Pathology, Oncology, Proteins, Early Detection of Cancer, Translational Medical Research, RPPA, RCC, Heterogeneity, Proteomics, Tumor Grade, intertumoral, tumor, metastatic, carcinoma, renal cancer, clear cell renal cell cancer, cancer, assay
Assessment of Immunologically Relevant Dynamic Tertiary Structural Features of the HIV-1 V3 Loop Crown R2 Sequence by ab initio Folding
Institutions: School of Medicine, New York University.
The antigenic diversity of HIV-1 has long been an obstacle to vaccine design, and this variability is especially pronounced in the V3 loop of the virus' surface envelope glycoprotein. We previously proposed that the crown of the V3 loop, although dynamic and sequence variable, is constrained throughout the population of HIV-1 viruses to an immunologically relevant β-hairpin tertiary structure. Importantly, there are thousands of different V3 loop crown sequences in circulating HIV-1 viruses, making 3D structural characterization of trends across the diversity of viruses difficult or impossible by crystallography or NMR. Our previous successful studies with folding of the V3 crown1, 2
used the ab initio
accessible in the ICM-Pro molecular modeling software package (Molsoft LLC, La Jolla, CA) and suggested that the crown of the V3 loop, specifically from positions 10 to 22, benefits sufficiently from the flexibility and length of its flanking stems to behave to a large degree as if it were an unconstrained peptide freely folding in solution. As such, rapid ab initio
folding of just this portion of the V3 loop of any individual strain of the 60,000+ circulating HIV-1 strains can be informative. Here, we folded the V3 loop of the R2 strain to gain insight into the structural basis of its unique properties. R2 bears a rare V3 loop sequence thought to be responsible for the exquisite sensitivity of this strain to neutralization by patient sera and monoclonal antibodies4, 5
. The strain mediates CD4-independent infection and appears to elicit broadly neutralizing antibodies. We demonstrate how evaluation of the results of the folding can be informative for associating observed structures in the folding with the immunological activities observed for R2.
Infection, Issue 43, HIV-1, structure-activity relationships, ab initio simulations, antibody-mediated neutralization, vaccine design
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1
. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2
. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5
to more rigorous statistical models, e.g.
Hidden Markov Models (HMMs)6-8
. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9
, which utilizes only explicit formulas-an innovation crucial to its performance advantages. More sophisticated then heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity-evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11
and epigenetic data12
to illustrate its usefulness.
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1
and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8
The ITS2 Database9
presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11
. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12
(direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13
. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14
search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16
for multiple sequence-structure alignment calculation and Neighbor Joining18
tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan , University of Kansas.
Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Automated Midline Shift and Intracranial Pressure Estimation based on Brain CT Images
Institutions: Virginia Commonwealth University, Virginia Commonwealth University Reanimation Engineering Science (VCURES) Center, Virginia Commonwealth University, Virginia Commonwealth University, Virginia Commonwealth University.
In this paper we present an automated system based mainly on the computed tomography (CT) images consisting of two main components: the midline shift estimation and intracranial pressure (ICP) pre-screening system. To estimate the midline shift, first an estimation of the ideal midline is performed based on the symmetry of the skull and anatomical features in the brain CT scan. Then, segmentation of the ventricles from the CT scan is performed and used as a guide for the identification of the actual midline through shape matching. These processes mimic the measuring process by physicians and have shown promising results in the evaluation. In the second component, more features are extracted related to ICP, such as the texture information, blood amount from CT scans and other recorded features, such as age, injury severity score to estimate the ICP are also incorporated. Machine learning techniques including feature selection and classification, such as Support Vector Machines (SVMs), are employed to build the prediction model using RapidMiner. The evaluation of the prediction shows potential usefulness of the model. The estimated ideal midline shift and predicted ICP levels may be used as a fast pre-screening step for physicians to make decisions, so as to recommend for or against invasive ICP monitoring.
Medicine, Issue 74, Biomedical Engineering, Molecular Biology, Neurobiology, Biophysics, Physiology, Anatomy, Brain CT Image Processing, CT, Midline Shift, Intracranial Pressure Pre-screening, Gaussian Mixture Model, Shape Matching, Machine Learning, traumatic brain injury, TBI, imaging, clinical techniques
Analyzing and Building Nucleic Acid Structures with 3DNA
Institutions: Rutgers - The State University of New Jersey, Columbia University .
The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the set of instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamic (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at http://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Genetics, Issue 74, Molecular Biology, Biochemistry, Bioengineering, Biophysics, Genomics, Chemical Biology, Quantitative Biology, conformational analysis, DNA, high-resolution structures, model building, molecular dynamics, nucleic acid structure, RNA, visualization, bioinformatics, three-dimensional, 3DNA, software
Predicting the Effectiveness of Population Replacement Strategy Using Mathematical Modeling
Institutions: University of California, Los Angeles.
Charles Taylor and John Marshall explain the utility of mathematical modeling for evaluating the effectiveness of population replacement strategy. Insight is given into how computational models can provide information on the population dynamics of mosquitoes and the spread of transposable elements through A. gambiae subspecies. The ethical considerations of releasing genetically modified mosquitoes into the wild are discussed.
Cellular Biology, Issue 5, mosquito, malaria, popuulation, replacement, modeling, infectious disease