PoSSuM (http://possum.cbrc.jp/PoSSuM/) is a database for detecting similar small-molecule binding sites on proteins. Since its initial release in 2011, PoSSuM has grown to provide information related to 49 million pairs of similar binding sites discovered among 5.5 million known and putative binding sites. This enlargement of the database is expected to enhance opportunities for biological and pharmaceutical applications, such as predictions of new functions and drug discovery. In this release, we have provided a new service named PoSSuM drug search (PoSSuMds) at http://possum.cbrc.jp/PoSSuM/drug_search/, in which we selected 194 approved drug compounds retrieved from ChEMBL, and detected their known binding pockets and pockets that are similar to them. Users can access and download all of the search results via a new web interface, which is useful for finding ligand analogs as well as potential target proteins. Furthermore, PoSSuMds enables users to explore the binding pocket universe within PoSSuM. Additionally, we have improved the web interface with new functions, including sortable tables and a viewer for visualizing and downloading superimposed pockets.
Brefeldin A-inhibited guanine nucleotide-exchange protein 3 (BIG3) has been identified recently as a novel regulator of estrogen signalling in breast cancer cells. Despite being a potential target for new breast cancer treatment, its amino acid sequence suggests no association with any well-characterized protein family and provides few clues as to its molecular function. In this paper, we predicted the structure, function and interactions of BIG3 using a range of bioinformatic tools.
In the second antibody modeling assessment, we used a semiautomated template-based structure modeling approach for 11 blinded antibody variable region (Fv) targets. The structural modeling method involved several steps, including template selection for framework and canonical structures of complementary determining regions (CDRs), homology modeling, energy minimization, and expert inspection. The submitted models for Fv modeling in Stage 1 had the lowest average backbone root mean square deviation (RMSD) (1.06 Å). Comparison to crystal structures showed the most accurate Fv models were generated for 4 out of 11 targets. We found that the successful modeling in Stage 1 mainly was due to expert-guided template selection for CDRs, especially for CDR-H3, based on our previously proposed empirical method (H3-rules) and the use of position specific scoring matrix-based scoring. Loop refinement using fragment assembly and multicanonical molecular dynamics (McMD) was applied to CDR-H3 loop modeling in Stage 2. Fragment assembly and McMD produced putative structural ensembles with low free energy values that were scored based on the OSCAR all-atom force field and conformation density in principal component analysis space, respectively, as well as the degree of consensus between the two sampling methods. The quality of 8 out of 10 targets improved as compared with Stage 1. For 4 out of 10 Stage-2 targets, our method generated top-scoring models with RMSD values of less than 1 Å. In this article, we discuss the strengths and weaknesses of our approach as well as possible directions for improvement to generate better predictions in the future.
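The backbone RMSD used above to compare models with crystal structures can be illustrated with a minimal sketch. It assumes the two coordinate sets are already superimposed and listed in the same atom order; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the assessment.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (in the input units, e.g. Å) between two equally sized,
    pre-superimposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy backbone atoms (hypothetical): every atom is displaced by 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(model, crystal))
```

In practice an optimal superposition step would precede this calculation; it is omitted here to keep the sketch short.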
High-dose ionizing radiation induces severe DNA damage in the epithelial stem cells in small intestinal crypts and causes gastrointestinal syndrome (GIS). Although the tumour suppressor p53 is a primary factor inducing death of crypt cells with DNA damage, its essential role in maintaining genome stability means inhibiting p53 to prevent GIS is not a viable strategy. Here we show that the innate immune receptor Toll-like receptor 3 (TLR3) is critical for the pathogenesis of GIS. Tlr3(-/-) mice show substantial resistance to GIS owing to significantly reduced radiation-induced crypt cell death. Despite this overall reduction, p53-dependent crypt cell death is not impaired in Tlr3(-/-) mice. p53-dependent crypt cell death causes leakage of cellular RNA, which induces extensive cell death via TLR3. An inhibitor of TLR3-RNA binding ameliorates GIS by reducing crypt cell death. Thus, we propose blocking TLR3 activation as a novel approach to treat GIS.
The bacterial cell-division protein FtsA anchors FtsZ to the cytoplasmic membrane, but how FtsA and FtsZ interact during membrane division remains obscure. We have solved a 2.2 Å resolution crystal structure of FtsA from Staphylococcus aureus (SaFtsA). In the crystals, SaFtsA molecules within the dimer units are twisted, in contrast to the straight filament of FtsA from Thermotoga maritima, and half of the S12-S13 hairpin regions are disordered. We confirmed that SaFtsZ and SaFtsA associate in vitro, and found that SaFtsZ GTPase activity is enhanced by interaction with SaFtsA.
Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism ("an interactome") is still a difficult task. To circumvent limitations of current high-throughput experimental techniques, it is necessary to develop high-performance computational methods for predicting PPIs.
Dextran sulfate (DS) is a negatively charged sulfated polysaccharide that suppresses the replication of influenza A viruses. The suppression was thought to be associated with inhibition of the hemagglutinin-dependent fusion activity. However, we previously showed that suppression by DS was observed not only at the initial stage of viral infection, but also later when virus is released from infected cells due to inhibition of neuraminidase (NA) activity. In the present study, we isolated DS-resistant A/Puerto Rico/8/34 (PR8) influenza viruses and analyzed the inhibition by DS. We found six mutations in NA genes of five independent resistant PR8 viruses and each resistant NA gene had two mutations. All mutations were from basic to acidic or neutral amino acids. In addition, R430L, K432E or K435E in the 430-435 region was a common mutation in all resistant NA genes. To determine which amino acid(s) are responsible for this resistance, a panel of recombinant viruses containing a PR8 and A/WSN/33(WSN) chimeric NA gene or an NA gene with different mutation(s) was generated using reverse genetics. Using recombinant viruses containing a PR8/WSN chimeric NA, we showed that one third of the C-terminal region of PR8 NA was responsible for DS-sensitivity. Recombinant viruses with a single mutation in NA replicated better than wild-type PR8 in the presence of DS, but were still DS-sensitive. However, replication of recombinant viruses with double mutations from the resistant viruses was not affected by the presence or absence of DS. In addition, resistant recombinant viruses were found to be sensitive to the NA inhibitor, oseltamivir and the oseltamivir-resistant recombinant virus was sensitive to DS. These results suggested that DS is an NA inhibitor with a different mechanism of action from the currently used NA inhibitors and that DS could be used in combination with these inhibitors to treat influenza virus infections.
Prioritising candidate genes for further experimental characterisation is an essential, yet challenging task in biomedical research. One way of achieving this goal is to identify specific biological themes that are enriched within the gene set of interest to obtain insights into the biological phenomena under study. Biological pathway data have been particularly useful in identifying functional associations of genes and/or gene sets. However, biological pathway information as compiled in varied repositories often differs in scope and content, preventing a more effective and comprehensive characterisation of gene sets. Here we describe a new approach to constructing biologically coherent gene sets from pathway data in major public repositories and employing them for functional analysis of large gene sets. We first revealed significant overlaps in gene content between different pathways and then defined a clustering method based on the shared gene content and the similarity of gene overlap patterns. We established the biological relevance of the constructed pathway clusters using independent quantitative measures and we finally demonstrated the effectiveness of the constructed pathway clusters in comparative functional enrichment analysis of gene sets associated with diverse human diseases gathered from the literature. The pathway clusters and gene mappings have been integrated into the TargetMine data warehouse and are likely to provide a concise, manageable and biologically relevant means of functional analysis of gene sets and to facilitate candidate gene prioritisation.
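The idea of grouping pathways by shared gene content can be sketched as follows. This is a simplified illustration, not the clustering method of the study: it swaps in the Jaccard index as the overlap measure and merges pathways into connected components above a threshold. The pathway names, gene sets and the threshold of 0.5 are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pathways(pathways, threshold=0.5):
    """Group pathways into connected components in which linked pathways
    share gene content with Jaccard index >= threshold (union-find)."""
    parent = {name: name for name in pathways}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q in combinations(pathways, 2):
        if jaccard(pathways[p], pathways[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in pathways:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pathways from two repositories with overlapping gene content.
pathways = {
    "repo1:apoptosis": {"TP53", "BAX", "CASP3", "CASP9"},
    "repo2:programmed_cell_death": {"TP53", "BAX", "CASP3", "BCL2"},
    "repo1:glycolysis": {"HK1", "PFKM", "ENO1"},
}
clusters = cluster_pathways(pathways)
print(clusters)
```

The two cell-death pathways share three of five distinct genes (Jaccard index 0.6) and fall into one cluster, while the glycolysis pathway remains on its own.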
Determining enzyme functions is essential for a thorough understanding of cellular processes. Although many prediction methods have been developed, it remains a significant challenge to predict enzyme functions at the fourth-digit level of the Enzyme Commission numbers. Functional specificity of enzymes often changes drastically by mutations of a small number of residues and therefore, information about these critical residues can potentially help discriminate detailed functions. However, because these residues must be identified by mutagenesis experiments, the available information is limited, and the lack of experimentally verified specificity determining residues (SDRs) has hindered the development of detailed function prediction methods and computational identification of SDRs. Here we present a novel method for predicting enzyme functions by random forests, EFPrf, along with a set of putative SDRs, the random forests derived SDRs (rf-SDRs). EFPrf consists of a set of binary predictors for enzymes in each CATH superfamily and the rf-SDRs are the residue positions corresponding to the most highly contributing attributes obtained from each predictor. EFPrf showed a precision of 0.98 and a recall of 0.89 in a cross-validated benchmark assessment. The rf-SDRs included many residues, whose importance for specificity had been validated experimentally. The analysis of the rf-SDRs revealed both a general tendency that functionally diverged superfamilies tend to include more active site residues in their rf-SDRs than in less diverged superfamilies, and superfamily-specific conservation patterns of each functional residue. EFPrf and the rf-SDRs will be an effective tool for annotating enzyme functions and for understanding how enzyme functions have diverged within each superfamily.
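For readers unfamiliar with the benchmark measures quoted above, precision and recall of a binary predictor can be computed as in the minimal sketch below. The label vectors are hypothetical toy data and do not reproduce the EFPrf benchmark.

```python
def precision_recall(true_labels, predicted_labels):
    """Precision = TP / (TP + FP) and recall = TP / (TP + FN) for binary labels (1 = positive)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predictions for eight enzymes against one functional class.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 0, 0, 0, 1]
precision, recall = precision_recall(true_labels, predicted_labels)
print(precision, recall)
```

Here three of the four predicted positives are correct and three of the four true positives are recovered, so both measures equal 0.75.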
In early stage drug development, it is desirable to assess the toxicity of compounds as quickly as possible. Biomarker genes can help predict whether a candidate drug will adversely affect a given individual, but they are often difficult to discover. In addition, the mechanism of toxicity of many drugs and common compounds is not yet well understood. The Japanese Toxicogenomics Project provides a large database of systematically collected microarray samples from rats (liver, kidney and primary hepatocytes) and human cells (primary hepatocytes) after exposure to 170 different compounds in different dosages and at different time intervals. However, until now, no intuitive user interface has been publicly available, making it time-consuming and difficult for individual researchers to explore the data.
FtsA from methicillin-resistant Staphylococcus aureus (MRSA) was cloned, overexpressed and purified. The protein was crystallized using the sitting-drop vapour-diffusion technique. A cocrystal with β-γ-imidoadenosine 5′-phosphate (AMPPNP; a nonhydrolysable ATP analogue) was grown using PEG 3350 as a precipitant at 293 K. X-ray diffraction data were collected to a resolution of 2.3 Å at 100 K. The crystal belonged to the monoclinic space group P2₁, with unit-cell parameters a = 75.31, b = 102.78, c = 105.90 Å, β = 96.54°. The calculated Matthews coefficient suggested that the asymmetric unit contained three or four monomers.
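The Matthews coefficient reasoning can be reproduced approximately with the sketch below. It uses the unit-cell parameters reported above, takes the non-right angle of the monoclinic cell to be β, and assumes two asymmetric units per cell, as in a primitive monoclinic space group with a twofold (screw) axis. The monomer molecular weight of 53 kDa is an assumed round figure for illustration only; it is not stated in the abstract.

```python
import math

# Unit-cell parameters from the abstract (Å and degrees).
A, B, C = 75.31, 102.78, 105.90
BETA_DEG = 96.54

ASYMMETRIC_UNITS_PER_CELL = 2   # assumption: primitive monoclinic cell with a twofold axis
MONOMER_MW_DA = 53_000.0        # assumption: approximate monomer mass, not from the source


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(n_monomers):
    """V_M in Å^3 per Da for n monomers in the asymmetric unit."""
    volume = monoclinic_cell_volume(A, B, C, BETA_DEG)
    return volume / (ASYMMETRIC_UNITS_PER_CELL * n_monomers * MONOMER_MW_DA)


for n in (2, 3, 4, 5):
    print(n, round(matthews_coefficient(n), 2))
```

Under these assumptions, three and four monomers give values of roughly 2.6 and 1.9 Å³/Da, both within the range commonly seen for protein crystals, which is consistent with the statement in the abstract.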
Hepatitis C virus (HCV) is a major cause of chronic liver disease. HCV NS5A protein plays an important role in HCV infection through its interactions with other HCV proteins and host factors. In an attempt to further our understanding of the biological context of protein interactions between NS5A and host factors in HCV pathogenesis, we generated an extensive physical interaction map between NS5A and cellular factors. By combining a yeast two-hybrid assay with comprehensive literature mining, we built the NS5A interactome composed of 132 human proteins that interact with NS5A. These interactions were integrated into a high-confidence human protein interactome (HPI) with the help of the TargetMine data warehouse system to infer an overall protein interaction map linking NS5A with the components of the host cellular networks. The NS5A-host interactions that were integrated with the HPI were shown to participate in compact and well-connected cellular networks. Functional analysis of the NS5A "infection" network using TargetMine highlighted cellular pathways associated with immune system, cellular signaling, cell adhesion, cellular growth and death among others, which were significantly targeted by NS5A-host interactions. In addition, cellular assays with in vitro HCV cell culture systems identified two ER-localized host proteins RTN1 and RTN3 as novel regulators of HCV propagation. Our analysis builds upon the present understanding of the role of NS5A protein in HCV pathogenesis and provides potential targets for more effective anti-HCV therapeutic intervention.
5,6-Dimethylxanthenone-4-acetic acid (DMXAA), a potent type I interferon (IFN) inducer, was evaluated as a chemotherapeutic agent in mouse cancer models and proved to be well tolerated in human cancer clinical trials. Despite its multiple biological functions, DMXAA has not been fully characterized for potential application as a vaccine adjuvant. In this report, we show that DMXAA does act as an adjuvant due to its unique property as a soluble innate immune activator. Using OVA as a model antigen, DMXAA was demonstrated to improve antigen-specific immune responses and induce a preferential Th2 (Type-2) response. The adjuvant effect was directly dependent on the IRF3-mediated production of type I interferon, but not IL-33. DMXAA could also enhance the immunogenicity of influenza split vaccine, which led to a significant increase in protective responses against live influenza virus challenge in mice compared to split vaccine alone. We propose that DMXAA can be used as an adjuvant that targets a specific innate immune signaling pathway via IRF3 for potential applications including vaccines against influenza, which require a high safety profile.
We present, to our knowledge, the first quantitative analysis of functional site diversity in homologous domain superfamilies. Different types of functional sites are considered separately. Our results show that most diverse superfamilies are very plastic in terms of the spatial location of their functional sites. This is especially true for protein-protein interfaces. In contrast, we confirm that catalytic sites typically occupy only a very small number of topological locations. Small-ligand binding sites are more diverse than expected, although in a more limited manner than protein-protein interfaces. In spite of the observed diversity, our results also confirm the previously reported preferential location of functional sites. We identify a subset of homologous domain superfamilies where diversity is particularly extreme, and discuss possible reasons for such plasticity, i.e. structural diversity. Our results do not contradict previous reports of preferential co-location of sites among homologues, but rather point at the importance of not ignoring other sites, especially in large and diverse superfamilies. Data on sites exploited by different relatives, within each well annotated domain superfamily, have been made accessible from the CATH website in order to highlight versatile superfamilies or superfamilies with highly preferential sites. This information is valuable for systems biology, and knowledge of any constraints on protein interactions could help in understanding the dynamic control of networks in which these proteins participate.
The novelty of our work lies in the comprehensive nature of the analysis - we have used a significantly larger dataset than previous studies - and the fact that in many superfamilies we show that different parts of the domain surface are exploited by different relatives for ligand/protein interactions, particularly in superfamilies which are diverse in sequence and structure, an observation not previously reported on such a large scale. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.
The acquisition of endocrine resistance is a common obstacle in endocrine therapy of patients with oestrogen receptor-α (ERα)-positive breast tumours. We previously demonstrated that the BIG3-PHB2 complex has a crucial role in the modulation of oestrogen/ERα signalling in breast cancer cells. Here we report a cell-permeable peptide inhibitor, called ERAP, that regulates multiple ERα-signalling pathways associated with tamoxifen resistance in breast cancer cells by inhibiting the interaction between BIG3 and PHB2. Intrinsic PHB2 released from BIG3 by ERAP directly binds to both nuclear- and membrane-associated ERα, which leads to the inhibition of multiple ERα-signalling pathways, including genomic and non-genomic ERα activation and ERα phosphorylation, and the growth of ERα-positive breast cancer cells both in vitro and in vivo. More importantly, ERAP treatment suppresses tamoxifen resistance and enhances tamoxifen responsiveness in ERα-positive breast cancer cells. These findings suggest inhibiting the interaction between BIG3 and PHB2 may be a new therapeutic strategy for the treatment of luminal-type breast cancer.
We previously demonstrated that though the human SAA1 gene shows no typical STAT3 response element (STAT3-RE) in its promoter region, STAT3 and the nuclear factor (NF-κB) p65 first form a complex following interleukin IL-1 and IL-6 (IL-1+6) stimulation, after which STAT3 interacts with a region downstream of the NF-κB RE in the SAA1 promoter. In this study, we employed a computational approach based on indirect read outs of protein-DNA contacts to identify a set of candidates for non-consensus STAT3 transcription factor binding sites (TFBSs). The binding of STAT3 to one of the predicted non-consensus TFBSs was experimentally confirmed through a dual luciferase assay and DNA affinity chromatography. The present study defines a novel STAT3 non-consensus TFBS at nt -75/-66 downstream of the NF-κB RE in the SAA1 promoter region that is required for NF-κB p65 and STAT3 to activate SAA1 transcription in human HepG2 liver cells. Our analysis builds upon the current understanding of STAT3 function, suggesting a wider array of mechanisms of STAT3 function in inflammatory response, and provides a useful framework for investigating novel TF-target associations with potential therapeutic implications.
Regulation of gene expression, protein synthesis, replication and assembly of many viruses involve RNA-protein interactions. Although some successful computational tools have been reported to recognize RNA binding sites in proteins, the problem of specificity remains poorly investigated. After the nucleotide base composition, the dinucleotide is the smallest unit of RNA sequence information and many RNA-binding proteins simply bind to regions enriched in one dinucleotide. Interaction preferences of protein subsequences and dinucleotides can be inferred from protein-RNA complex structures, enabling a training-based prediction approach.
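The notion of a region being enriched in one dinucleotide can be made concrete with a short sketch that computes overlapping dinucleotide frequencies in an RNA sequence. The function name and the example sequence are hypothetical, and this is only the composition step, not the structure-trained prediction approach described above.

```python
from collections import Counter


def dinucleotide_frequencies(rna):
    """Relative frequencies of overlapping dinucleotides in an RNA sequence.
    DNA-style input is accepted by converting T to U."""
    rna = rna.upper().replace("T", "U")
    pairs = [rna[i:i + 2] for i in range(len(rna) - 1)]
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {dinucleotide: count / len(pairs) for dinucleotide, count in counts.items()}


# Hypothetical AU-rich fragment.
freqs = dinucleotide_frequencies("AUAUAUGC")
most_common = max(freqs, key=freqs.get)
print(most_common, freqs[most_common])
```

For this fragment, AU accounts for three of the seven overlapping dinucleotides and is therefore reported as the most enriched.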
Computational prediction of residues that participate in protein-protein interactions is a difficult task, and state-of-the-art methods have shown only limited success in this arena. One possible problem with these methods is that they try to predict interacting residues without incorporating information about the partner protein, although it is unclear how much partner information could enhance prediction performance. To address this issue, the two following comparisons are of crucial significance: (a) comparison between the predictability of inter-protein residue pairs, i.e., predicting exactly which residue pairs interact with each other given two protein sequences; this can be achieved by either combining conventional single-protein predictions or making predictions using a new model trained directly on the residue pairs, and the performance of these two approaches may be compared; (b) comparison between the predictability of the interacting residues in a single protein (irrespective of the partner residue or protein) from conventional methods and predictions converted from the pair-wise trained model. Using these two streams of training and validation procedures and employing similar two-stage neural networks, we showed that the models trained on pair-wise contacts outperformed the partner-unaware models in predicting both interacting pairs and interacting single-protein residues. Prediction performance decreased with the size of the conformational change upon complex formation; this trend is similar to docking, even though no structural information was used in our prediction. An example application that predicts two partner-specific interfaces of a protein was shown to be effective, highlighting the potential of the proposed approach. Finally, a preliminary attempt was made to score docking decoy poses using prediction of interacting residue pairs; this analysis produced an encouraging result.
Protein-lipid interactions play essential roles in the conformational stability and biological functions of membrane proteins. However, few of the previous computational studies have taken into account the atomic details of protein-lipid interactions explicitly.
Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.
The uptake carrier organic anion-transporting polypeptide 1B3 (OATP1B3, gene SLCO1B3) is involved in the hepatic clearance of xenobiotics including statins, taxanes, and mycophenolic acid. We sought to assess the SLCO1B3 coding region for as yet unidentified polymorphisms and to analyze their functional relevance.
The high levels of sequence diversity and rapid rates of evolution of HIV-1 represent the main challenges for developing effective therapies. However, there are constraints imposed by the three-dimensional protein structure that affect the sequence space accessible to the evolution of HIV-1. Here, we present a strategy for predicting the set of possible amino acid replacements in HIV. Our approach is based on the identification of likely amino acid changes in the context of these structural constraints using environment-specific substitution matrices as well as considering the physical constraints imposed by local structure. Assessment of the power of various published algorithms in predicting the evolution of HIV-1 Gag P17 shows that it is possible to use these methods to make accurate predictions of the sequence diversity. Our own method, SubFit, uses knowledge of local structural constraints; it achieves prediction success similar to that of the best-performing methods. We also show that erroneous predictions are largely due to infrequently occurring amino acids that will probably have severe fitness costs for the protein. Future improvements, for example incorporating covariation and immunological constraints, will permit more reliable prediction of viral evolution.
Hepatitis C virus (HCV) is a major cause of chronic liver disease worldwide. Here we attempt to further our understanding of the biological context of protein interactions in HCV pathogenesis, by investigating interactions between HCV proteins Core and NS4B and human host proteins. Using the yeast two-hybrid (Y2H) membrane protein system, eleven human host proteins interacting with Core and 45 interacting with NS4B were identified, most of which are novel. These interactions were used to infer overall protein interaction maps linking the viral proteins with components of the host cellular networks. Core and NS4B proteins contribute to highly compact interaction networks that may enable the virus to respond rapidly to host physiological responses to HCV infection. Analysis of the interaction networks highlighted enriched biological pathways likely influenced in HCV infection. Inspection of individual interactions offered further insights into the possible mechanisms that permit HCV to evade the host immune response and appropriate host metabolic machinery. Follow-up cellular assays with cell lines infected with HCV genotype 1b and 2a strains validated Core interacting proteins ENO1 and SLC25A5 and host protein PXN as novel regulators of HCV replication and viral production. ENO1 siRNA knockdown was found to inhibit HCV replication in both the HCV genotypes and viral RNA release in genotype 2a. PXN siRNA inhibition was observed to inhibit replication specifically in genotype 1b but not in genotype 2a, while SLC25A5 siRNA facilitated a minor increase in the viral RNA release in genotype 2a. Thus, our analysis can provide potential targets for more effective anti-HCV therapeutic intervention.
Ketopantoate reductase (KPR) is the second enzyme in the pantothenate (vitamin B5) biosynthesis pathway, an essential metabolic pathway identified as a potential target for new antimicrobials. The sequence similarity among putative KPRs is limited and KPR itself belongs to a large superfamily of 6-phosphogluconate dehydrogenases. Therefore, it is necessary to discriminate between true KPRs and other enzymes. In this paper, we describe a systematic analysis of putative KPRs in the context of this superfamily. Detailed structural analysis allowed us to define key residues for KPR activity and we classified eight structural genomics structures of the KPR family into four functional subclasses. We proposed a semi-automatic protocol, using sequence-structure homology recognition scores, for assigning KPR and related proteins to these subclasses and applied it to a representative set of 103 completely sequenced bacterial genomes. A similar approach can be applied to other enzyme families, which would aid the correct identification of drug targets and help design novel specific inhibitors.
The evolution of protein folds is under strong constraints from their surrounding environment. Although folding in water-soluble proteins is driven primarily by hydrophobic forces, the nature of the forces that determine the folding and stability of transmembrane proteins is still not fully understood. Furthermore, the chemically heterogeneous lipid bilayer has a non-uniform effect on protein structure. In this article, we attempt to gain insight into the nature of this effect by examining the impact of various types of local structure environment on amino acid substitution, based on alignments of high-resolution structures of polytopic helical transmembrane (HTM) proteins combined with sequences of close homologs. Compared to globular proteins, burying amino acid sidechains, especially hydrophilic ones, led to a lower increase in conservation in both the lipid-water interface region and the hydrocarbon core (HC) region. This observation is due to surface residues in HTM proteins, especially in the HC region, being relatively highly conserved, suggesting higher evolutionary constraints from their specific interactions with the surrounding lipid molecules. Polar and small residues, particularly Pro and Gly, show a noticeable increase in conservation as they are positioned more towards the centre of the membrane, which is consistent with their recognized key roles in structural stability. In addition, the examination of hydrogen bonds in the membrane environment identified some exposed hydrophilic residues being better conserved when not hydrogen-bonded to other residues, supporting the importance of lipid-protein sidechain interactions. The conclusions presented in this study highlight the distinct features of substitution matrices that take into account the membrane environment, and their potential role in improving sequence-structure alignments of transmembrane proteins.
To investigate the relationships between functional subclasses and sequence and structural information contained in the active-site and ligand-binding residues (LBRs), we performed a detailed analysis of seven diverse enzyme superfamilies: aldolase class I, TIM-barrel glycosidases, alpha/beta-hydrolases, P-loop containing nucleotide triphosphate hydrolases, collagenase, Zn peptidases, and glutamine phosphoribosylpyrophosphate, subunit 1, domain 1. These homologous superfamilies, as defined in CATH, were selected from the enzyme catalytic-mechanism database. We defined active-site and LBRs based solely on the literature information and complex structures in the Protein Data Bank. From a structure-based multiple sequence alignment for each CATH homologous superfamily, we extracted subsequences consisting of the aligned positions that were used as an active-site or a ligand-binding site by at least one sequence. Using both the subsequences and full-length alignments, we performed cluster analysis with three sequence distance measures. We showed that the cluster analysis using the subsequences was able to detect functional subclasses more accurately than the clustering using the full-length alignments. The subsequences determined by only the literature information and complex structures, thus, had sufficient information to detect the functional subclasses. Detailed examination of the clustering results provided new insights into the mechanism of functional diversification for these superfamilies.
The limited availability of protein structures often restricts the functional annotation of proteins and the identification of their protein-protein interaction sites. Computational methods to identify interaction sites from protein sequences alone are, therefore, required for unraveling the functions of many proteins. This article describes a new method (PSIVER) to predict interaction sites, i.e. residues binding to other proteins, in protein sequences. Only sequence features (position-specific scoring matrix and predicted accessibility) are used for training a Naïve Bayes classifier (NBC), and conditional probabilities of each sequence feature are estimated using a kernel density estimation method (KDE).
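A Naïve Bayes classifier whose per-feature conditional probabilities come from kernel density estimates can be sketched as follows. This is a minimal illustration under stated assumptions rather than the PSIVER implementation: a Gaussian kernel with a fixed bandwidth of 0.5 is assumed, and the two feature values per residue (a PSSM-derived score and a predicted accessibility) are hypothetical toy numbers.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def train_nbc(features, labels, bandwidth=0.5):
    """Estimate a class prior and one KDE per feature for every class."""
    model = {}
    n_features = len(features[0])
    for label in set(labels):
        rows = [row for row, lab in zip(features, labels) if lab == label]
        prior = len(rows) / len(features)
        densities = [gaussian_kde([row[j] for row in rows], bandwidth)
                     for j in range(n_features)]
        model[label] = (prior, densities)
    return model


def predict_proba(model, x):
    """Posterior class probabilities under the naive independence assumption."""
    scores = {}
    for label, (prior, densities) in model.items():
        likelihood = prior
        for density, value in zip(densities, x):
            likelihood *= density(value)
        scores[label] = likelihood
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Hypothetical training residues: (PSSM-derived score, predicted accessibility).
features = [(2.0, 0.8), (1.5, 0.7), (1.8, 0.9), (-1.0, 0.2), (-0.5, 0.1), (-1.2, 0.3)]
labels = [1, 1, 1, 0, 0, 0]  # 1 = interaction site, 0 = non-interaction site
model = train_nbc(features, labels)
posterior = predict_proba(model, (1.7, 0.75))
print(posterior)
```

The bandwidth controls how smooth each estimated density is; in a real setting it would be tuned on held-out data rather than fixed.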
Many structural properties such as solvent accessibility, dihedral angles and helix-helix contacts can be assigned to each residue in a membrane protein. Independent studies exist on the analysis and sequence-based prediction of some of these so-called one-dimensional features. However, there is little explanation of why certain residues are predicted in a wrong structural class or with large errors in the absolute values of these features. On the other hand, membrane proteins undergo conformational changes to allow transport as well as ligand binding. These conformational changes often occur via residues that are inherently flexible and hence, predicting fluctuations in residue positions is of great significance.
Conserved residues forming tightly packed clusters have been shown to be energy hot spots in both protein-protein and protein-DNA complexes. A number of analyses of these clusters of conserved residues (CCRs) have been reported, all pointing to a crucial role that these clusters play in protein function, especially in protein-protein and protein-DNA interactions. However, there is currently no publicly available tool to automatically detect such clusters. Here, we present a web server that takes a coordinate file in PDB format as input and automatically executes all the steps to identify CCRs in protein structures. In addition, it calculates the structural properties of each residue and of the CCRs. We also present statistics to show that CCRs, determined by these procedures, are significantly enriched in hot spots in protein-protein and protein-RNA complexes, which supplements our more detailed results on protein-DNA complexes. We expect that the CCRXP web server will be useful in studies of protein structures and their interactions and in selecting mutagenesis targets. The web server can be accessed at http://ccrxp.netasa.org.
Sequence dependence of solvent accessibility in globular and membrane proteins is well established. However, this important structural property has been poorly investigated in nucleic acids. Investigation of the structural determinants of transcriptional and post-transcriptional processes in gene expression is also at an early stage, and there is a need to explore novel sequence and structural features of both DNA and RNA that may explain basic and regulatory mechanisms at various stages of expression. We have recently shown that nucleotide accessibility in double-stranded DNA molecules strongly depends on sequence context and can be predicted using neighbor information. In this work, we investigate the statistics, neighbor dependence and predictability of nucleotide solvent accessibility for various types of RNA molecules (single-stranded, double-stranded, protein-unbound and protein-bound). We found that the average solvent accessibility of different RNA trinucleotides varies considerably. Interestingly, important translational signals (the initiator AUG codon and the Shine-Dalgarno site) were characterized by high solvent accessibility, which could be important for their selection in evolution. We also analyzed the relationship between nucleotide accessibility and synonymous codon usage bias in some genomes and found that the two properties are directly related. We believe that the analysis and prediction of nucleotide solvent accessibility open new avenues for exploring more biologically meaningful relationships between RNA structure and function.
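One simple way to tabulate accessibility by trinucleotide context is sketched below. It is an assumption-laden illustration rather than the procedure used in the study: the function name, the choice to credit each window with the accessibility of its central nucleotide, and the input format (one precomputed accessibility value per nucleotide) are all hypothetical.

```python
from collections import defaultdict


def mean_accessibility_by_trinucleotide(sequence, accessibility):
    """Average the accessibility of the central nucleotide for each trinucleotide.

    sequence: RNA string such as "AUGGC".
    accessibility: one precomputed solvent accessibility value per nucleotide.
    """
    if len(sequence) != len(accessibility):
        raise ValueError("sequence and accessibility must have equal length")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for i in range(1, len(sequence) - 1):
        trinucleotide = sequence[i - 1:i + 2]
        totals[trinucleotide] += accessibility[i]  # value of the central base
        counts[trinucleotide] += 1
    return {tri: totals[tri] / counts[tri] for tri in totals}
```

Pooling such averages over many structures would give a per-trinucleotide table of the kind the abstract refers to when it says accessibility varies considerably between trinucleotides.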
A newly identified family of NAD-dependent D-2-hydroxyacid dehydrogenases (D-2-HydDHs) catalyzes the stereo-specific reduction of branched-chain 2-keto acids with bulky hydrophobic side chains to 2-hydroxyacids. They are promising targets for industrial/practical applications, particularly in the stereo-specific synthesis of C3-branched D-hydroxyacids. Comparative modeling and docking studies have been performed to build models of the enzyme-cofactor-substrate complexes and identify key residues for cofactor and substrate recognition. To explore large conformational transitions (domain motions), a normal mode analysis was employed using a simple potential and the protein models. Our analysis suggests that the new D-2-HydDH family members possess the N-terminal NAD(H) binding Rossmann-fold domain and the alpha-helical C-terminal substrate binding domain. A hinge bending motion between the N- and C-terminal domains was predicted, which would trigger the switch of the conserved essential Lys to form a key hydrogen bond with the C2 ketone of the 2-keto acid substrates. Our findings will be useful for site-directed mutagenesis studies and protein engineering.
The cytokine lymphotoxin-alpha (LT alpha) activates various biological functions through its three receptor subtypes, tumor necrosis factor receptor 1 (TNFR1), TNFR2 and herpes virus entry mediator (HVEM), but the relative contribution of each receptor to each function is unclear. Therefore it is important to create mutant LT alpha with receptor selectivity for optimized cancer therapy and the analysis of receptor function. Here, we attempted to create a lysine-deficient mutant LT alpha with TNFR1-selective bioactivity using a phage display technique. We obtained the TNFR1-selective mutant LT alpha R1selLT, which contained the mutations K19N, K28Q, K39S, K84Q, K89V, and K119H. Compared with wild-type LT alpha (wtLT alpha), R1selLT showed several-fold higher bioactivity via TNFR1 but 40-fold lower bioactivity via TNFR2. Kinetic association-dissociation parameters of R1selLT with TNFR2 were higher than those of wtLT alpha, whereas these parameters of R1selLT with TNFR1 were lower than those of wtLT alpha, suggesting that destabilization of the R1selLT-TNFR2 complex causes the decreased bioactivity of R1selLT on TNFR2. We also showed that the K84Q mutation contributed to the enhanced activity via TNFR1, and K39S lowered activity via TNFR2. R1selLT likely will be useful in cancer therapy and in analysis of the LT alpha structure-function relationship.
DNA recognition by proteins is one of the most important processes in living systems. Therefore, understanding the recognition process in general, and identifying mutual recognition sites in proteins and DNA in particular, carries great significance. The sequence and structural dependence of DNA-binding sites in proteins has led to the development of successful machine learning methods for their prediction. However, all existing machine learning methods predict DNA-binding sites, irrespective of their target sequence and hence, none of them is helpful in identifying specific protein-DNA contacts. In this work, we formulate the problem of predicting specific DNA-binding sites in terms of contacts between the residue environments of proteins and the identity of a mononucleotide or a dinucleotide step in DNA. The aim of this work is to take a protein sequence or structural features as inputs and predict for each amino acid residue if it binds to DNA at locations identified by one of the four possible mononucleotides or one of the 10 unique dinucleotide steps. Contact predictions are made at various levels of resolution viz. in terms of side chain, backbone and major or minor groove atoms of DNA.
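The count of 10 unique dinucleotide steps follows from treating a step and its reverse complement as the same step in double-stranded DNA, which reduces the 16 ordered dinucleotides to 10 classes. The short sketch below enumerates them; the function names and the choice of the alphabetically smaller member as the representative label are assumptions made for illustration.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(step):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(step))


def canonical_step(step):
    """Label a step by the alphabetically smaller of itself and its reverse complement."""
    return min(step, reverse_complement(step))


def unique_dinucleotide_steps():
    """Enumerate dinucleotide steps that are distinct up to reverse complement."""
    return sorted({canonical_step("".join(pair))
                   for pair in product("ACGT", repeat=2)})
```

Four steps (AT, TA, CG and GC) are their own reverse complements, and the remaining twelve pair up into six classes, giving ten prediction targets in addition to the four mononucleotides.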
Heat shock factor 2 (HSF2) is a member of a vertebrate transcription factor family for genes of heat shock proteins and is involved in the regulation of development and cellular differentiation. The DNA binding property of HSF2 is modulated by the post-translational modification of a specific lysine residue in its DNA binding domain by small ubiquitin-like modifier (SUMO), but the consequences of SUMOylation and its underlying molecular mechanism remain unclear. Here we show the inhibitory effect of SUMOylation on the interaction between HSF2 and DNA based on biochemical analysis using isolated recombinant HSF2. NMR study of the SUMOylated DNA binding domain of HSF2 indicates that the SUMO moiety is flexible with respect to the DNA binding domain and has neither a noncovalent interface with nor a structural effect on the domain. Combined with data from double electron-electron resonance and paramagnetic NMR relaxation enhancement experiments, these results suggest that SUMO attachment negatively modulates the formation of the protein-DNA complex through a randomly distributed steric interference.
We present a statistical analysis of residue environment preferences along the membrane normal in helical transmembrane (HTM) proteins, based on an up-to-date nonredundant set of protein structures. Distinct amino acid residue propensities were revealed, both in terms of lipid accessibility and depth within the lipid bilayer, highlighting their potential usefulness for alignment and modelling of membrane proteins. Using the propensities in the HTM proteins, new lipophobicity scales (LIPS) were derived for the lipid bilayer interface (LI) and the hydrocarbon core (HC) regions of the membrane, measuring the tendencies of different amino acids to occupy protein-buried or lipid-exposed positions. The LIPS for LI and HC resemble some of the existing LIPS such as kPROT, TMLIP2, and LA, but our new scales were derived by using more comprehensive information than any of the existing scales and are distinct overall. Effective free energies of transfer derived from the LIPS showed a good correlation with a semiempirical scale for the transfer energies from the interface of palmitoyloleoylphosphocholine (POPC) bilayers to octanol (Delta WW(ioct)). The new scales also predicted the lipophobic effect in the LI to be smaller than the hydrophobic effect governing the folding of globular proteins, consistent with theory and experiment. These results provided a coherent description of lipophobicity in the distinct layers of the membrane and gave clarity to the widely discussed notion of whether membrane proteins can be regarded as "inside out" of globular proteins.
HCV is a major cause of chronic liver disease worldwide and is a formidable therapeutic challenge. Recently, Diamond et al. analyzed the proteomic profiles of liver samples from HCV-positive liver transplant recipients, supplemented with an independent metabolite analysis. They used a computational approach, which highlighted the enriched functional themes and topological attributes associated with the protein association network based on their clinical data and suggested a crucial role of oxidative stress in fibrosis progression in HCV infection. Their findings provide new insights into the mechanisms that regulate the progression of HCV-associated liver fibrosis, which may be useful for identification of suitable biomarkers to evaluate the onset and severity of hepatic fibrosis and the development of new therapeutic and anti-HCV strategies.
In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large for researchers to grasp the features and contents of each database.
Proteins interact with different partners to perform different functions and it is important to elucidate the determinants of partner specificity in protein complex formation. Although methods for detecting specificity determining positions have been developed previously, direct experimental evidence for these amino acid residues is scarce, and the lack of information has prevented further computational studies. In this article, we constructed a dataset that is likely to exhibit specificity in protein complex formation, based on available crystal structures and several intuitive ideas about interaction profiles and functional subclasses. We then defined a "structure-based specificity determining position (sbSDP)" as a set of equivalent residues in a protein family showing a large variation in their interaction energy with different partners. We investigated sequence and structural features of sbSDPs and demonstrated that their amino acid propensities significantly differed from those of other interacting residues and that the importance of many of these residues for determining specificity had been verified experimentally.
Stat3 mediates a complex spectrum of cellular responses, including inflammation, cell proliferation, and apoptosis. Although evidence exists in support of a positive role for Stat3 in cancer, its role has remained somewhat controversial because of insufficient study of how its genetic deletion may affect carcinogenesis in various tissues. In this study, we show using epithelium-specific knockout mice (Stat3(Δ/Δ)) that Stat3 blunts rather than supports antitumor immunity in carcinogen-induced lung tumorigenesis. Although Stat3(Δ/Δ) mice did not show any lung defects in terms of proliferation, apoptosis, or angiogenesis, they exhibited reduced urethane-induced tumorigenesis and increased antitumor inflammation and natural killer (NK) cell immunity. Comparative microarray analysis revealed an increase in Stat3(Δ/Δ) tumors in proinflammatory chemokine production and a decrease in MHC class I antigen expression associated with NK cell recognition. Consistent with these findings, human non-small cell lung cancer (NSCLC) cells in which Stat3 was silenced displayed an enhancement of proinflammatory chemokine production, reduced expression of MHC class I antigen, and increased susceptibility to NK cell-mediated cytotoxicity. In addition, supernatants from Stat3-silenced NSCLC cells promoted monocyte migration. Collectively, our findings argue that Stat3 exerts an inhibitory effect on antitumor NK cell immunity in the setting of carcinogen-induced tumorigenesis.
Hepatitis C virus (HCV) causes chronic liver disease worldwide. HCV Core protein (Core) forms the viral capsid and is crucial for HCV pathogenesis and HCV-induced hepatocellular carcinoma, through its interaction with the host factor proteasome activator PA28γ. Here, using BD-PowerBlot high-throughput Western array, we attempt to further investigate HCV pathogenesis by comparing the protein levels in liver samples from Core-transgenic mice with or without the knockout of PA28γ expression (abbreviated PA28γ(-/-)CoreTG and CoreTG, respectively) against the wild-type (WT). The differentially expressed proteins integrated into the human interactome were shown to participate in compact and well-connected cellular networks. Functional analysis of the interaction networks using a newly developed data warehouse system highlighted cellular pathways associated with vesicular transport, immune system, cellular adhesion, and cell growth and death among others that were prominently influenced by Core and PA28γ in HCV infection. Follow-up assays with in vitro HCV cell culture systems validated VTI1A, a vesicular transport associated factor, which was upregulated in CoreTG but not in PA28γ(-/-)CoreTG, as a novel regulator of HCV release but not replication. Our analysis provided novel insights into the Core-PA28γ interplay in HCV pathogenesis and identified potential targets for better anti-HCV therapy and potentially novel biomarkers of HCV infection.
Toxin-antitoxin systems are widespread in bacteria and archaea. They perform diverse functional roles, including the generation of persistence, maintenance of genetic loci and resistance to bacteriophages through abortive infection. Toxin-antitoxin systems have been divided into three types, depending on the nature of the interacting macromolecules. The recently discovered Type III toxin-antitoxin systems encode protein toxins that are inhibited by pseudoknots of antitoxic RNA, encoded by short tandem repeats upstream of the toxin gene. Recent studies have identified the range of Type I and Type II systems within current sequence databases. Here, structure-based homology searches were combined with iterative protein sequence comparisons to obtain a current picture of the prevalence of Type III systems. Three independent Type III families were identified, according to toxin sequence similarity. The three families were found to be far more abundant and widespread than previously known, with examples throughout the Firmicutes, Fusobacteria and Proteobacteria. Functional assays confirmed that representatives from all three families act as toxin-antitoxin loci within Escherichia coli and at least two of the families confer resistance to bacteriophages. This study shows that active Type III toxin-antitoxin systems are far more diverse than previously known, and suggests that more remain to be identified.