Computational sequence design methods are used to engineer proteins with desired properties such as increased thermal stability and novel function. In addition, these algorithms can be used to identify an envelope of sequences that may be compatible with a particular protein fold topology. In this regard, we hypothesized that sequence-property prediction, specifically secondary structure prediction, could be significantly enhanced by using a large database of computationally designed sequences. We performed a large-scale test of this hypothesis with 6511 diverse protein domains and 50 designed sequences per domain. After analyzing the inherent accuracy of the designed-sequence database, we found it necessary to constrain the fraction of the native sequence that was allowed to change. With mutational constraints, accuracy improved relative to unconstrained design, but the diversity of the designed sequences, and hence the effective size of the database, was moderately reduced. Overall, the best three-state prediction accuracy (Q3) that we achieved was nearly one percentage point higher than that obtained with a natural sequence database alone, well below the theoretically possible improvement of 8-10 percentage points. Furthermore, our nascent method improved the accuracy of the state-of-the-art PSIPRED program by a percentage point.
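The three-state accuracy mentioned above is conventionally the percentage of residues whose predicted state (helix, strand, or coil) matches the observed state. A minimal sketch of that calculation follows; the function name `q3_accuracy`, the H/E/C string encoding, and the example strings are illustrative assumptions, not taken from the study.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the predicted
    three-state label (H, E, or C) equals the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 17-residue example: H = helix, E = strand, C = coil.
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
score = q3_accuracy(predicted, observed)
print(f"Q3 = {score:.1f}%")
```

In this toy example 15 of the 17 positions agree, so a gain of one percentage point on a real benchmark corresponds to roughly one additional correct residue per hundred.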
Surfactant protein-A (SP-A) and Toll-like receptor-4 (TLR4) proteins are recognized as pathogen-recognition receptors. An exaggerated activation of TLR4 induces inflammatory response, whereas SP-A protein down-regulates inflammation. We hypothesized that SP-A-TLR4 interaction may lead to inhibition of inflammation. In this study, we investigated interaction between native baboon lung SP-A and baboon and human TLR4-MD2 proteins by coimmunoprecipitation/immunoblotting and microwell-based methods. The interaction between SP-A and TLR4-MD2 proteins was then analyzed using a bioinformatics approach. In the in silico model of SP-A-TLR4-MD2 complex, we identified potential binding regions and amino acids at the interface of SP-A-TLR4. Using this information, we synthesized a library of human SP-A-derived peptides that contained interacting amino acids. Next, we tested whether the TLR4-interacting SP-A peptides would suppress inflammatory cytokines. The peptides were screened for any changes in the tumor necrosis factor-α (TNF-α) response against lipopolysaccharide (LPS) stimuli in the mouse JAWS II dendritic cell line. Different approaches used in this study suggested binding between SP-A and TLR4-MD2 proteins. In cells pretreated with peptides, three of seven peptides increased TNF-α production against LPS. However, two of these peptides (SPA4: GDFRYSDGTPVNYTNWYRGE and SPA5: YVGLTEGPSPGDFRYSDFTP) decreased the TNF-α production in LPS-challenged JAWS II dendritic cells; SPA4 peptide showed more pronounced inhibitory effect than SPA5 peptide. In conclusion, we identify a human SP-A-derived peptide (SPA4 peptide) that interacts with TLR4-MD2 protein and inhibits the LPS-stimulated release of TNF-α in JAWS II dendritic cells.
Most high-throughput experimental results of protein-protein interactions (PPIs) are seemingly inconsistent with each other. In this article, we re-evaluated these contradictions within the context of the underlying domain-domain interactions (DDIs) for two Escherichia coli and four Saccharomyces cerevisiae PPI datasets derived from high-throughput (yeast two-hybrid and tandem affinity purification) experimental platforms. For shared DDIs across pairs of compared datasets, we observed a remarkably high pair-wise correlation (Pearson correlation coefficient between 0.80 and 0.84) between datasets of the same organism derived from the same experimental platform. To a lesser degree, this concordance also held true for more general inter-platform and intra-species comparisons (Pearson correlation coefficient between 0.52 and 0.89). Thus, although varying experimental conditions can influence the ability of individual proteins to interact and, therefore, create apparent differences among PPIs, the physical nature of the underlying interactions, captured by DDIs, is the same and can be used to model and predict PPIs.
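The comparison described above can be sketched as a Pearson correlation computed only over the DDIs that two datasets share. The abstract does not state which per-DDI quantity was correlated, so this sketch assumes each dataset is summarized as a count of supporting PPIs per domain pair; the function names, domain labels, and counts are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def shared_ddi_correlation(dataset_a, dataset_b):
    """Correlate per-DDI values over the DDIs present in both datasets."""
    shared = sorted(set(dataset_a) & set(dataset_b))
    return pearson([dataset_a[k] for k in shared],
                   [dataset_b[k] for k in shared])


# Hypothetical counts of PPIs supporting each domain pair (assumed data).
y2h_a = {("SH3", "Pro_rich"): 10, ("Kinase", "SH2"): 4,
         ("PDZ", "PDZ_lig"): 7, ("WD40", "Fbox"): 1}
y2h_b = {("SH3", "Pro_rich"): 20, ("Kinase", "SH2"): 8,
         ("PDZ", "PDZ_lig"): 14, ("RING", "UBC"): 3}

r = shared_ddi_correlation(y2h_a, y2h_b)
print(f"shared DDIs: 3, Pearson r = {r:.2f}")
```

Restricting the comparison to shared DDIs means that pairs seen in only one dataset, such as the WD40 and RING entries here, do not affect the coefficient.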
Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster.
Protein domain prediction is often the preliminary step in both experimental and computational protein research. Here we present a new method to predict the domain boundaries of a multidomain protein from its amino acid sequence using a fuzzy mean operator. Using the nr-sequence database together with a reference protein set (RPS) containing known domain boundaries, the operator is used to assign a likelihood value for each residue of the query sequence as belonging to a domain boundary. This procedure robustly identifies contiguous boundary regions. For a dataset with a maximum sequence identity of 30%, the average domain prediction accuracy of our method is 97% for one-domain proteins and 58% for multidomain proteins. The presented model is capable of using new sequence/structure information without re-parameterization after each RPS update. When tested on a current database using a four-year-old RPS and on a database that contains different domain definitions than those used to train the models, our method consistently yielded the same accuracy, while two other published methods did not. A comparison with other domain prediction methods used in the CASP7 competition indicates that our method performs better than existing sequence-based methods.
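One way per-residue boundary likelihoods could be turned into contiguous boundary regions is to threshold the scores and group consecutive residues. The sketch below shows only that grouping step and does not reproduce the fuzzy mean operator itself; the function name `boundary_regions`, the threshold, the minimum length, and the scores are all assumptions made for illustration.

```python
def boundary_regions(likelihood, threshold=0.5, min_length=1):
    """Group consecutive residues whose boundary likelihood is at or above
    `threshold` into (start, end) index pairs (0-based, inclusive).
    Runs shorter than `min_length` residues are discarded."""
    regions = []
    start = None
    for i, value in enumerate(likelihood):
        if value >= threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(likelihood) - start >= min_length:
        regions.append((start, len(likelihood) - 1))
    return regions


# Hypothetical likelihood values for a 9-residue stretch.
scores = [0.1, 0.2, 0.7, 0.8, 0.6, 0.2, 0.1, 0.9, 0.1]
print(boundary_regions(scores, threshold=0.5, min_length=2))
```

With a minimum length of 2, the isolated high score at index 7 is dropped and only the run covering indices 2 to 4 is reported, which illustrates why a length filter favors contiguous regions over single-residue spikes.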