T cell receptors (TCRs) can recognize diverse lipid and metabolite antigens presented by MHC-like molecules CD1 and MR1, and the molecular basis of many of these interactions has not been determined. Here we applied our protein docking algorithm TCRFlexDock, previously developed to perform docking of TCRs to peptide-MHC (pMHC) molecules, to predict the binding of αβ and γδ TCRs to CD1 and MR1, starting with the structures of the unbound molecules.
To evade host immune mechanisms, many bacteria secrete immunomodulatory enzymes. Streptococcus pyogenes, one of the most common human pathogens, secretes a large endoglycosidase, EndoS, which removes carbohydrates in a highly specific manner from IgG antibodies. This modification renders antibodies incapable of eliciting host effector functions through either complement or Fcγ receptors, providing the bacteria with a survival advantage. On account of this antibody-specific modifying activity, EndoS is being developed as a promising injectable therapeutic for autoimmune diseases that rely on autoantibodies. Additionally, EndoS is a key enzyme used in the chemoenzymatic synthesis of homogeneously glycosylated antibodies with tailored Fcγ receptor-mediated effector functions. Despite the tremendous utility of this enzyme, the molecular basis of EndoS specificity for, and processing of, IgG antibodies has remained poorly understood. Here, we report the X-ray crystal structure of EndoS and provide a model of its encounter complex with its substrate, the IgG1 Fc domain. We show that EndoS is composed of five distinct protein domains, including glycosidase, leucine-rich repeat, hybrid Ig, carbohydrate binding module, and three-helix bundle domains, arranged in a distinctive V-shaped conformation. Our data suggest that the substrate enters the concave interior of the enzyme structure, is held in place by the carbohydrate binding module, and that concerted conformational changes in both enzyme and substrate are required for subsequent antibody deglycosylation. The EndoS structure presented here provides a framework from which novel endoglycosidases could be engineered for additional clinical and biotechnological applications.
Protein-protein interactions are essential to cellular and immune function, and in many cases, because of the absence of an experimentally determined structure of the complex, these interactions must be modeled to obtain an understanding of their molecular basis. We present a user-friendly protein docking server, based on the rigid-body docking programs ZDOCK and M-ZDOCK, to predict structures of protein-protein complexes and symmetric multimers. With a goal of providing an accessible and intuitive interface, we provide options for users to guide the scoring and the selection of output models, in addition to dynamic visualization of input structures and output docking models. This server enables the research community to easily and quickly produce structural models of protein-protein complexes and symmetric multimers for their own analysis.
T cell receptors (TCRs) are key to antigen-specific immunity and are increasingly being explored as therapeutics, most visibly in cancer immunotherapy. As TCRs typically possess only low-to-moderate affinity for their peptide/MHC (pMHC) ligands, there is a recognized need to develop affinity-enhanced TCR variants. Previous in vitro engineering efforts have yielded remarkable improvements in TCR affinity, yet concerns exist about the maintenance of peptide specificity and the biological impacts of ultra-high affinity. As opposed to in vitro engineering, computational design can directly address these issues, in theory permitting the rational control of peptide specificity together with relatively controlled increments in affinity. Here we explored the efficacy of computational design with the clinically relevant TCR DMF5, which recognizes nonameric and decameric epitopes from the melanoma-associated Melan-A/MART-1 protein presented by the class I MHC HLA-A2. We tested multiple mutations selected by flexible and rigid modeling protocols, assessed impacts on affinity and specificity, and utilized the data to examine and improve algorithmic performance. We identified multiple mutations that improved binding affinity, and characterized the structure, affinity, and binding kinetics of a previously reported double mutant that exhibits an impressive 400-fold affinity improvement for the decameric pMHC ligand without detectable binding to non-cognate ligands. The structure of this high affinity mutant indicated minimal conformational consequences and emphasized the high fidelity of our modeling procedure. Overall, our work showcases the capability of computational design to generate TCRs with improved pMHC affinities while explicitly accounting for peptide specificity, as well as its potential for generating TCRs with customized antigen targeting capabilities.
Chronic hepatitis C virus (HCV) infection is the most common cause of end-stage liver disease, often leading to liver transplantation, in which case circulating virions typically infect the transplanted liver within hours and viral concentrations can quickly exceed pre-transplant levels. MBL-HCV1 is a fully human monoclonal antibody recognizing a linear epitope of the HCV E2 envelope glycoprotein (amino acids 412-423). The ability of MBL-HCV1 to prevent HCV recurrence after liver transplantation was investigated in a phase 2 randomized clinical trial evaluating six MBL-HCV1-treated subjects and five placebo-treated subjects. MBL-HCV1 treatment significantly delayed time to viral rebound compared with placebo treatment. Here we report results from high-throughput sequencing on the serum of each of the eleven enrolled subjects prior to liver transplantation and after viral rebound. We further sequenced the sera of the MBL-HCV1-treated subjects at various interim time points to study the evolution of antibody-resistant viral variants. We detected mutations at one of two positions within the antibody epitope--mutations of N at position 415 to D, K or S, or mutation of N at position 417 to S. It has been previously reported that N415 is not glycosylated in the wild-type E2 protein, but N417S can lead to glycosylation at position 415. Thus N415 is a key position for antibody recognition and the only routes we identified for viral escape, within the constraints of HCV fitness in vivo, involve mutating or glycosylating this position. Evaluation of mutations along the entire E1 and E2 proteins revealed additional positions that changed moderately before and after MBL-HCV1 treatment for subsets of the six subjects, yet underscored the relative importance of position 415 in MBL-HCV1 resistance.
Conformational entropy is an important component of protein-protein interactions; however, there is no reliable method for computing this parameter. We have developed a statistical measure of residual backbone entropy in folded proteins by using the φ-ψ distributions of the twenty amino acids in common secondary structures. The backbone entropy patterns of amino acids within helix, sheet or coil form clusters that recapitulate the branching and hydrogen bonding properties of the side-chains in the secondary structure type. The same types of residues in coil and sheet have identical backbone entropies, while helix residues have much smaller conformational entropies. We estimated the backbone entropy change for immunoglobulin Complementarity Determining Regions (CDRs) from the crystal structures of 34 low affinity TCRs and 40 high affinity Fabs as a result of forming protein complexes. Surprisingly, we discovered that the computed backbone entropy loss of only the CDR3, but not all CDRs, correlated significantly with the kinetic and affinity constants of the 74 selected complexes. Consequently, we propose a simple algorithm to introduce proline mutations that restrict the conformational flexibility of CDRs and enhance the kinetics and affinity of immunoglobulin interactions. Combining the proline mutations with rationally designed mutants from a previous study led to a 2,400-fold increase in the affinity of the A6 TCR for Tax-HLA-A2. However, this mutational scheme failed to induce significant binding changes in the already high affinity C225-Fab/huEGFR interface. Our results will serve as a roadmap to formulate more effective target functions to design immune complexes with improved biological functions.
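As an illustration of how a backbone entropy could be derived from a φ-ψ distribution, the following is a minimal sketch that bins dihedral angles and computes the Shannon entropy of the resulting histogram. The abstract does not give the exact estimator, so the 30° bin width, the function name `backbone_entropy`, and the choice to report the entropy in units of the gas constant R are all assumptions made for this example.

```python
import numpy as np

def backbone_entropy(phi, psi, bin_width=30.0):
    """Shannon entropy (in units of R) of a binned phi-psi distribution.

    phi, psi: backbone dihedral angles in degrees, in [-180, 180).
    bin_width: histogram bin size in degrees (an assumed value).
    """
    edges = np.arange(-180.0, 180.0 + bin_width, bin_width)
    counts, _, _ = np.histogram2d(phi, psi, bins=[edges, edges])
    p = counts.ravel() / counts.sum()
    p = p[p > 0]  # empty bins contribute nothing to the sum
    return float(-np.sum(p * np.log(p)))
```

Under this definition, a residue type whose observed angles all fall in one bin has zero entropy, and a broader distribution gives a larger value. An entropy change on binding would then be the difference between values computed for the bound and unbound distributions.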
T cell receptors (TCRs) are immune proteins that specifically bind to antigenic molecules, which are often foreign peptides presented by major histocompatibility complex proteins (pMHCs), playing a key role in the cellular immune response. To advance our understanding and modeling of this dynamic immunological event, we assembled a protein-protein docking benchmark consisting of 20 structures of crystallized TCR/pMHC complexes for which unbound structures exist for both TCR and pMHC. We used our benchmark to compare predictive performance using several flexible and rigid backbone TCR/pMHC docking protocols. Our flexible TCR docking algorithm, TCRFlexDock, improved predictive success over the fixed backbone protocol, leading to near-native predictions for 80% of the TCR/pMHC cases among the top 10 models, and 100% of the cases in the top 30 models. We then applied TCRFlexDock to predict the two distinct docking modes recently described for a single TCR bound to two different antigens, and tested several protein modeling scoring functions for prediction of TCR/pMHC binding affinities. This algorithm and benchmark should enable future efforts to predict and design uncharacterized TCR/pMHC complexes.
We compared the performance of template-free (docking) and template-based methods for the prediction of protein-protein complex structures. We found similar performance for a template-based method based on threading (COTH) and another template-based method based on structural alignment (PRISM). The template-based methods showed similar performance to a docking method (ZDOCK) when the latter was allowed one prediction for each complex, but when the same number of predictions was allowed for each method, the docking approach outperformed template-based approaches. We identified strengths and weaknesses in each method. Template-based approaches were better able to handle complexes that involved conformational changes upon binding. Furthermore, the threading-based and docking methods were better than the structural-alignment-based method for enzyme-inhibitor complex prediction. Finally, we show that the near-native (correct) predictions were generally not shared by the various approaches, suggesting that integrating their results could be the superior strategy.
Community-wide blind prediction experiments such as CAPRI and CASP provide an objective measure of the current state of predictive methodology. Here we describe a community-wide assessment of methods to predict the effects of mutations on protein-protein interactions. Twenty-two groups predicted the effects of comprehensive saturation mutagenesis for two designed influenza hemagglutinin binders and the results were compared with experimental yeast display enrichment data obtained using deep sequencing. The most successful methods explicitly considered the effects of mutation on monomer stability in addition to binding affinity, carried out explicit side-chain sampling and backbone relaxation, evaluated packing, electrostatic, and solvation effects, and correctly identified around a third of the beneficial mutations. Much room for improvement remains for even the best techniques, and large-scale fitness landscapes should continue to provide an excellent test bed for continued evaluation of both existing and new prediction methodologies.
Type 1 diabetes (T1D) is a T cell-mediated disease. It is strongly associated with susceptibility haplotypes within the major histocompatibility complex, but this association accounts for an estimated 50% of susceptibility. Other studies have identified as many as 50 additional susceptibility loci, but the effect of most is very modest (odds ratio (OR) <1.5). What accounts for the "missing heritability" is unknown and is often attributed to environmental factors. Here we review new data on the cognate ligand of MHC molecules, the T cell receptor (TCR). In rats, we found that one allele of a TCR variable gene, Vβ13A, is strongly associated with T1D (OR >5) and that deletion of Vβ13+ T cells prevents diabetes. A role for the TCR is also suspected in NOD mice, but TCR regions have not been associated with human T1D. To investigate this disparity, we tested the hypothesis in silico that previous studies of human T1D genetics were underpowered to detect MHC-contingent TCR susceptibility. We show that stratifying by MHC markedly increases statistical power to detect potential TCR susceptibility alleles. We suggest that the TCR regions are viable candidates for T1D susceptibility genes, could account for "missing heritability," and could be targets for prevention.
Computational prediction of the 3D structures of molecular interactions is a challenging area, often requiring significant computational resources to produce structural predictions with atomic-level accuracy. This can be particularly burdensome when modeling large sets of interactions, macromolecular assemblies, or interactions between flexible proteins. We previously developed a protein docking program, ZDOCK, which uses a fast Fourier transform to perform a 3D search of the spatial degrees of freedom between two molecules. By utilizing a pairwise statistical potential in the ZDOCK scoring function, there were notable gains in docking accuracy over previous versions, but this improvement in accuracy came at a substantial computational cost. In this study, we incorporated a recently developed 3D convolution library into ZDOCK, and additionally modified ZDOCK to dynamically orient the input proteins for more efficient convolution. These modifications resulted in an average of over 8.5-fold improvement in running time when tested on 176 cases in a newly released protein docking benchmark, as well as substantially less memory usage, with no loss in docking accuracy. We also applied these improvements to a previous version of ZDOCK that uses a simpler non-pairwise atomic potential, yielding an average speed improvement of over 5-fold on the docking benchmark, while maintaining predictive success. This permits the utilization of ZDOCK for more intensive tasks such as docking flexible molecules and modeling of interactomes, and can be run more readily by those with limited computational resources.
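The fast Fourier transform search mentioned in the abstract above rests on a general identity: the correlation of two grids over all translations can be obtained from a product in Fourier space. The sketch below shows only that identity with NumPy, for a single ligand orientation and plain real-valued grids. It is not ZDOCK's scoring function or its convolution library; the function names and the use of cyclic (wrap-around) translations are assumptions of this example.

```python
import numpy as np

def translational_scores(receptor_grid, ligand_grid):
    """Score every cyclic translation of the ligand grid against the receptor grid.

    scores[t] = sum over x of receptor_grid[x] * ligand_grid[x - t]
    """
    fr = np.fft.fftn(receptor_grid)
    fl = np.fft.fftn(ligand_grid)
    # Cross-correlation theorem: multiply by the complex conjugate, then invert.
    return np.real(np.fft.ifftn(fr * np.conj(fl)))

def best_translation(receptor_grid, ligand_grid):
    """Return the grid shift (as a tuple of ints) with the highest score."""
    scores = translational_scores(receptor_grid, ligand_grid)
    idx = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(i) for i in idx)
```

For an N x N x N grid this replaces a direct sum over all pairs of grid points and shifts with three transforms, which is where the speed of FFT-based docking comes from. A complete search would also repeat the calculation over many ligand rotations.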
The CAPRI (Critical Assessment of Predicted Interactions) and CASP (Critical Assessment of protein Structure Prediction) experiments have demonstrated the power of community-wide tests of methodology in assessing the current state of the art and spurring progress in the very challenging areas of protein docking and structure prediction. We sought to bring the power of community-wide experiments to bear on a very challenging protein design problem that provides a complementary but equally fundamental test of current understanding of protein-binding thermodynamics. We have generated a number of designed protein-protein interfaces with very favorable computed binding energies but which do not appear to be formed in experiments, suggesting that there may be important physical chemistry missing in the energy calculations. A total of 28 research groups took up the challenge of determining what is missing: we provided structures of 87 designed complexes and 120 naturally occurring complexes and asked participants to identify energetic contributions and/or structural features that distinguish between the two sets. The community found that electrostatics and solvation terms partially distinguish the designs from the natural complexes, largely due to the nonpolar character of the designed interactions. Beyond this polarity difference, the community found that the designed binding surfaces were, on average, structurally less embedded in the designed monomers, suggesting that backbone conformational rigidity at the designed surface is important for realization of the designed function. These results can be used to improve computational design strategies, but there is still much to be learned; for example, one designed complex, which does form in experiments, was classified by all metrics as a nonbinder.
Protein engineering is becoming increasingly important for pharmaceutical applications where controlling the specificity and affinity of engineered proteins is required to create targeted protein therapeutics. Affinity increases of several thousand-fold are now routine for a variety of protein engineering approaches, and the structural and energetic bases of affinity maturation have been investigated in a number of such cases. Previously, a 3-million-fold affinity maturation process was achieved in a protein-protein interaction composed of a variant T-cell receptor fragment and a bacterial superantigen. Here, we present the molecular basis of this affinity increase. Using X-ray crystallography, shotgun reversion/replacement scanning mutagenesis, and computational analysis, we describe, in molecular detail, a process by which extrainterfacial regions of a protein complex can be rationally manipulated to significantly improve protein engineering outcomes.
We report the performance of the ZDOCK and ZRANK algorithms in CAPRI rounds 13-19 and introduce a novel measure, atom contact frequency (ACF). To compute ACF, we identify the residues that most often make contact with the binding partner in the complete set of ZDOCK predictions for each target. We used ACF to predict the interface of the proteins, which, in combination with the biological data available in the literature, is a valuable addition to our docking pipeline. Furthermore, we incorporated a straightforward and efficient clustering algorithm with two purposes: (1) to determine clusters of similar docking poses (corresponding to energy funnels) and (2) to remove redundancies from the final set of predictions. With these new developments, we achieved at least one acceptable prediction for targets 29 and 36, at least one medium-quality prediction for targets 41 and 42, and at least one high-quality prediction for targets 37 and 40; thus, we succeeded for six out of a total of 12 targets.
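The contact-counting idea behind ACF, as described in the abstract above, can be sketched as follows. This is a simplified illustration, not the published procedure: each receptor residue is represented by a single point, the 6.0 Å cutoff is an assumed value, and the function name `contact_frequency` is hypothetical.

```python
import numpy as np

def contact_frequency(receptor_xyz, ligand_poses, cutoff=6.0):
    """Fraction of docking poses in which each receptor point contacts the ligand.

    receptor_xyz: (n_receptor, 3) coordinates, one point per residue (assumption).
    ligand_poses: sequence of (n_ligand, 3) coordinate arrays, one per predicted pose.
    cutoff: contact distance in the same units as the coordinates (assumed value).
    """
    receptor_xyz = np.asarray(receptor_xyz, dtype=float)
    counts = np.zeros(len(receptor_xyz))
    for pose in ligand_poses:
        pose = np.asarray(pose, dtype=float)
        # Pairwise distances, shape (n_receptor, n_ligand).
        d = np.linalg.norm(receptor_xyz[:, None, :] - pose[None, :, :], axis=-1)
        counts += (d <= cutoff).any(axis=1)
    return counts / len(ligand_poses)
```

Residues with the highest frequencies across the full prediction set would then be taken as candidate interface residues.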
Understanding the energetic and structural response to multiple mutations in a protein-protein interface is a key aspect of rational protein design. Here we investigate the cooperativity of combinations of point mutations of a T cell receptor (TCR) that binds in vivo to HLA-A2 MHC and a viral peptide. The mutations were obtained from two sources: a structure-based design study on the TCR alpha chain (nine mutations) and an in vitro selection study on the TCR beta chain (four mutations). In addition to combining the highest-affinity variants from each chain, we tested other combinations of mutations within and among the chains, for a total of 23 TCR mutants that we measured for binding kinetics to the peptide and major histocompatibility complex. A wide range of binding affinities was observed, from 2- to 1000-fold binding improvement versus that of the wild type, with significant nonadditive effects observed within and between TCR chains. This included an amino acid-dependent cooperative interaction between CDR1 and CDR3 residues that are separated by more than 9 Å in the wild-type complex. When analyzing the kinetics of the mutations, we found that the association rates were primarily responsible for the cooperativity, while the dissociation rates were responsible for the anticooperativity (less-than-additive energetics). On the basis of structural modeling of anticooperative mutants, we determined that side chain clash between proximal mutants likely led to nonadditive binding energies. These results highlight the complex nature of TCR association and binding and will be informative in future design efforts that combine multiple mutant residues.
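Nonadditivity of the kind discussed in the abstract above is commonly quantified by converting fold changes in affinity to binding free energy changes and comparing a double mutant with the sum of its single mutants. The sketch below uses the standard relation ΔΔG = -RT ln(fold improvement). The temperature of 298.15 K and the function names are assumptions of this example, and no values from the study are used.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 C); not stated in the abstract

def ddg_from_fold(fold_improvement, temperature=T_KELVIN):
    """Binding free energy change (kcal/mol) for a fold improvement in affinity.

    A fold improvement greater than 1 (tighter binding) gives a negative value.
    """
    return -R_KCAL * temperature * math.log(fold_improvement)

def coupling_energy(fold_a, fold_b, fold_ab, temperature=T_KELVIN):
    """Deviation of a double mutant from additivity (kcal/mol).

    Negative: cooperative (better than additive).
    Positive: anticooperative (less than additive).
    """
    return (ddg_from_fold(fold_ab, temperature)
            - ddg_from_fold(fold_a, temperature)
            - ddg_from_fold(fold_b, temperature))
```

With this convention, two hypothetical 10-fold single mutants that combine to exactly 100-fold are additive (coupling energy of zero), while a combined improvement above or below 100-fold would indicate cooperativity or anticooperativity, respectively.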
The latency-associated nuclear antigen (LANA) of Kaposi's sarcoma-associated herpesvirus functions as an origin-binding protein (OBP) and transcriptional regulator. LANA binds the terminal repeats via the C-terminal DNA-binding domain (DBD) to support latent DNA replication. To date, the structure of LANA has not been solved. Sequence alignments among OBPs of gammaherpesviruses have revealed that the C terminus of LANA is structurally related to EBNA1, the OBP of Epstein-Barr virus. Based on secondary structure predictions for LANA(DBD) and published structures of EBNA1(DBD), this study used bioinformatics tools to model a putative structure for LANA(DBD) bound to DNA. To validate the predicted model, 38 mutants targeting the most conserved motifs, namely three alpha-helices and a conserved proline loop, were constructed and functionally tested. In agreement with data for EBNA1, residues in helices 1 and 2 mainly contributed to sequence-specific DNA binding and replication activity, whilst mutations in helix 3 affected replication activity and multimer formation. Additionally, several mutants were isolated with discordant phenotypes, which may aid further studies into LANA function. In summary, these data suggest that the secondary and tertiary structures of LANA and EBNA1 DBDs are conserved and are critical for (i) sequence-specific DNA binding, (ii) multimer formation, (iii) LANA-dependent transcriptional repression, and (iv) DNA replication.
T-cell receptors (TCRs) are proteins that recognize peptides from foreign proteins bound to the major histocompatibility complex (MHC) on the surface of an antigen-presenting cell. This interaction enables the T cells to initiate a cell-mediated immune response to terminate cells displaying the foreign peptide on their MHC. Naturally occurring TCRs have high specificity but low affinity toward the peptide-MHC (pepMHC) complex. This prevents the usage of solubilized TCRs for diagnosis and treatment of viral infections or cancers. Efforts to enhance the binding affinity of several TCRs have been reported in recent years, through randomized libraries and in vitro selection. However, there have been no reported efforts to enhance the affinity via structure-based design, which allows more control and understanding of the mechanism of improvement. Here, we have applied structure-based design to a human TCR to improve its pepMHC binding. Our design method evolved based on iterative steps of prediction, testing, and generating more predictions based on the new data. The final design function, named ZAFFI, has a correlation of 0.77 and average error of 0.35 kcal/mol with the binding free energies of 26 point mutations for this system that we measured by surface plasmon resonance (SPR). Applying the filter that we developed to remove nonbinding predictions, this correlation increases to 0.85, and the average error decreases to 0.3 kcal/mol. Using this algorithm, we predicted and tested several point mutations that improved binding, with one giving over sixfold binding improvement. Four of the point mutations that improved binding were then combined to give a mutant TCR that binds the pepMHC 99 times more strongly than the wild-type TCR.
The Encyclopedia of DNA Elements (ENCODE) consortium aims to identify all functional elements in the human genome including transcripts, transcriptional regulatory regions, along with their chromatin states and DNA methylation patterns. The ENCODE project generates data utilizing a variety of techniques that can enrich for regulatory regions, such as chromatin immunoprecipitation (ChIP), micrococcal nuclease (MNase) digestion and DNase I digestion, followed by deeply sequencing the resulting DNA. As part of the ENCODE project, we have developed a Web-accessible repository at http://factorbook.org. In Wiki format, factorbook is a transcription factor (TF)-centric repository of all ENCODE ChIP-seq datasets on TF-binding regions, as well as the rich analysis results of these data. In the first release, factorbook contains 457 ChIP-seq datasets on 119 TFs in a number of human cell lines, the average profiles of histone modifications and nucleosome positioning around the TF-binding regions, sequence motifs enriched in the regions and the distance and orientation preferences between motif sites.
Chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-seq) has become the dominant technique for mapping transcription factor (TF) binding regions genome-wide. We performed an integrative analysis centered around 457 ChIP-seq data sets on 119 human TFs generated by the ENCODE Consortium. We identified highly enriched sequence motifs in most data sets, revealing new motifs and validating known ones. The motif sites (TF binding sites) are highly conserved evolutionarily and show distinct footprints upon DNase I digestion. We frequently detected secondary motifs in addition to the canonical motifs of the TFs, indicating tethered binding and cobinding between multiple TFs. We observed significant position and orientation preferences between many cobinding TFs. Genes specifically expressed in a cell line are often associated with a greater occurrence of nearby TF binding in that cell line. We observed cell-line-specific secondary motifs that mediate the binding of the histone deacetylase HDAC2 and the enhancer-binding protein EP300. TF binding sites are located in GC-rich, nucleosome-depleted, and DNase I sensitive regions, flanked by well-positioned nucleosomes, and many of these features show cell type specificity. The GC-richness may be beneficial for regulating TF binding because, when unoccupied by a TF, these regions are occupied by nucleosomes in vivo. We present the results of our analysis in a TF-centric web repository Factorbook (http://factorbook.org) and will continually update this repository as more ENCODE data are generated.
The expression of endogenous retrotransposable elements, including long interspersed nuclear element 1 (LINE-1 or L1) and human endogenous retrovirus, accompanies neoplastic transformation and infection with viruses such as HIV. The ability to engender immunity safely against such self-antigens would facilitate the development of novel vaccines and immunotherapies. In this article, we address the safety and immunogenicity of vaccination with these elements. We used immunohistochemical analysis and literature precedent to identify potential off-target tissues in humans and establish their translatability in preclinical species to guide safety assessments. Immunization of mice with murine L1 open reading frame 2 induced strong CD8 T cell responses without detectable tissue damage. Similarly, immunization of rhesus macaques with human LINE-1 open reading frame 2 (96% identity with macaque), as well as simian endogenous retrovirus-K Gag and Env, induced polyfunctional T cell responses to all Ags, and Ab responses to simian endogenous retrovirus-K Env. There were no adverse safety or pathological findings related to vaccination. These studies provide the first evidence, to our knowledge, that immune responses can be induced safely against this class of self-antigens and pave the way for investigation of them as HIV- or tumor-associated targets.
T cells use the αβ TCR to bind peptides presented by MHC proteins (pMHC) on APCs. Formation of a TCR-pMHC complex initiates T cell signaling via a poorly understood process, potentially involving changes in oligomeric state, altered interactions with CD3 subunits, and mechanical stress. These mechanisms could be facilitated by binding-induced changes in the TCR, but the nature and extent of any such alterations are unclear. Using hydrogen/deuterium exchange, we demonstrate that ligation globally rigidifies the TCR, which via entropic and packing effects will promote associations with neighboring proteins and enhance the stability of existing complexes. TCR regions implicated in lateral associations and signaling are particularly affected. Computational modeling demonstrated a high degree of dynamic coupling between the TCR constant and variable domains that is dampened upon ligation. These results raise the possibility that TCR triggering could involve a dynamically driven, allosteric mechanism.
The recognition potential of most families of DNA-binding domains (DBDs) remains relatively unexplored. Homeodomains (HDs), like many other families of DBDs, display limited diversity in their preferred recognition sequences. To explore the recognition potential of HDs, we utilized a bacterial selection system to isolate HD variants, from a randomized library, that are compatible with each of the 64 possible 3′ triplet sites (i.e., TAANNN). The majority of these selections yielded sets of HDs with overrepresented residues at specific recognition positions, implying the selection of specific binders. The DNA-binding specificity of 151 representative HD variants was subsequently characterized, identifying HDs that preferentially recognize 44 of these target sites. Many of these variants contain novel combinations of specificity determinants that are uncommon or absent in extant HDs. These novel determinants, when grafted into different HD backbones, produce a corresponding alteration in specificity. This information was used to create more explicit HD recognition models, which can inform the prediction of transcriptional regulatory networks for extant HDs or the engineering of HDs with novel DNA-recognition potential. The diversity of recovered HD recognition sequences raises important questions about the fitness barrier that restricts the evolution of alternate recognition modalities in natural systems.
We present an energy function for predicting binding free energies of protein-protein complexes, using the three-dimensional structures of the complex and unbound proteins as input. Our function is a linear combination of nine terms and achieves a correlation coefficient of 0.63 with experimental measurements when tested on a benchmark of 144 complexes using leave-one-out cross validation. Although we systematically tested both atomic and residue-based scoring functions, the selected function is dominated by residue-based terms. Our function is stable for subsets of the benchmark stratified by experimental pH and extent of conformational change upon complex formation, with correlation coefficients ranging from 0.61 to 0.66.
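To make the evaluation scheme in the abstract above concrete, here is a minimal sketch of leave-one-out cross validation for a linear combination of scoring terms, followed by a Pearson correlation between predicted and measured values. The abstract does not say how the weights were fit, so ordinary least squares, the added intercept, and the function names are assumptions of this example.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out predictions from an ordinary least-squares linear model.

    X: (n_complexes, n_terms) matrix of energy terms.
    y: (n_complexes,) measured binding free energies.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([X, np.ones(len(y))])  # intercept column (assumption)
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # hold out complex i
        w, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ w                    # predict the held-out complex
    return preds

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])
```

Because each complex is predicted by a model that never saw it, the resulting correlation is a less optimistic estimate than one computed on the data used for fitting.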