The cell-biological program termed the epithelial-mesenchymal transition (EMT) confers on cancer cells mesenchymal traits and an ability to enter the cancer stem cell (CSC) state. However, the interactions between CSCs and their surrounding microenvironment are poorly understood. Here we show that tumour-associated monocytes and macrophages (TAMs) create a CSC niche through juxtacrine signalling with CSCs. We performed quantitative proteomic profiling and found that the EMT program upregulates the expression of CD90, also known as Thy1, and EphA4, which mediate the physical interactions of CSCs with TAMs by directly binding with their respective counter-receptors on these cells. In response, the EphA4 receptor on the carcinoma cells activates Src and NF-κB. In turn, NF-κB in the CSCs induces the secretion of a variety of cytokines that serve to sustain the stem cell state. Indeed, admixed macrophages enhance the CSC activities of carcinoma cells. These findings underscore the significance of TAMs as important components of the CSC niche.
Despite the staggering diversity of venomous animals, there seems to be remarkable convergence in regard to the types of proteins used as toxin scaffolds. However, our understanding of this fascinating area of evolution has been hampered by the narrow taxonomical range studied, with entire groups of venomous animals remaining almost completely unstudied. One such group is centipedes, class Chilopoda, which emerged about 440 Ma and may represent the oldest terrestrial venomous lineage next to scorpions. Here, we provide the first comprehensive insight into the chilopod "venome" and its evolution, which has revealed novel and convergent toxin recruitments as well as entirely new toxin families among both high- and low molecular weight venom components. The ancient evolutionary history of centipedes is also apparent from the differences between the Scolopendromorpha and Scutigeromorpha venoms, which diverged over 430 Ma, and appear to employ substantially different venom strategies. The presence of a wide range of novel proteins and peptides in centipede venoms highlights these animals as a rich source of novel bioactive molecules. Understanding the evolutionary processes behind these ancient venom systems will not only broaden our understanding of which traits make proteins and peptides amenable to neofunctionalization but it may also aid in directing bioprospecting efforts.
Protein abundance and phosphorylation convey important information about pathway activity and molecular pathophysiology in diseases including cancer, providing biological insight, informing drug and diagnostic development, and guiding therapeutic intervention. Analyzed tissues are usually collected without tight regulation or documentation of ischemic time. To evaluate the impact of ischemia, we collected human ovarian tumor and breast cancer xenograft tissue without vascular interruption and performed quantitative proteomics and phosphoproteomics after defined ischemic intervals. Although the global expressed proteome and most of the >25,000 quantified phosphosites were unchanged after 60 min, rapid phosphorylation changes were observed in up to 24% of the phosphoproteome, representing activation of critical cancer pathways related to stress response, transcriptional regulation, and cell death. Both pan-tumor and tissue-specific changes were observed. The demonstrated impact of pre-analytical tissue ischemia on tumor biology mandates caution in interpreting stress-pathway activation in such samples and motivates reexamination of collection protocols for phosphoprotein analysis.
The extracellular matrix (ECM) is a major component of tumors and a significant contributor to cancer progression. In this study, we use proteomics to investigate the ECM of human mammary carcinoma xenografts and show that primary tumors of differing metastatic potential differ in ECM composition. Both tumor cells and stromal cells contribute to the tumor matrix, and tumors of differing metastatic ability differ in both tumor- and stroma-derived ECM components. We define ECM signatures of poorly and highly metastatic mammary carcinomas, and these signatures reveal up-regulation of signaling pathways including TGFβ and VEGF. We further demonstrate that several proteins characteristic of highly metastatic tumors (LTBP3, SNED1, EGLN1, and S100A2) play causal roles in metastasis, albeit at different steps. Finally, we show that high expression of LTBP3 and SNED1 correlates with poor outcome for ER(-)/PR(-) breast cancer patients. This study thus identifies novel biomarkers that may serve as prognostic and diagnostic tools. DOI: http://dx.doi.org/10.7554/eLife.01308.001.
Colorectal cancer is the third most frequently diagnosed cancer and the third leading cause of cancer deaths in the United States. Although tumor cell-intrinsic mechanisms controlling colorectal carcinogenesis have been identified, novel prognostic and diagnostic tools as well as novel therapeutic strategies are still needed to monitor and target colon cancer progression. We and others have previously shown, using mouse models, that the extracellular matrix (ECM), a major component of the tumor microenvironment, is an important contributor to tumor progression. To identify candidate biomarkers, we sought to define ECM signatures of metastatic colorectal cancers and their metastases to the liver.
The proteome informatics research group of the Association of Biomolecular Resource Facilities conducted a study to assess the community's ability to detect and characterize peptides bearing a range of biologically occurring post-translational modifications when present in a complex peptide background. A data set derived from a mixture of synthetic peptides with biologically occurring modifications combined with a yeast whole cell lysate as background was distributed to a large group of researchers and their results were collectively analyzed. The results from the twenty-four participants, who represented a broad spectrum of experience levels with this type of data analysis, produced several important observations. First, there is significantly more variability in the ability to assess whether a result is significant than there is to determine the correct answer. Second, labile post-translational modifications, particularly tyrosine sulfation, present a challenge for most researchers. Finally, for modification site localization there are many tools being employed, but researchers are currently unsure of the reliability of the results these programs are producing.
Full-length de novo sequencing of unknown proteins remains a challenging open problem. Traditional methods that sequence spectra individually are limited by short peptide length, incomplete peptide fragmentation, and ambiguous de novo interpretations. We address these issues by determining consensus sequences for assembled tandem mass (MS/MS) spectra from overlapping peptides (e.g., by using multiple enzymatic digests). We have combined electron-transfer dissociation (ETD) with collision-induced dissociation (CID) and higher-energy collision-induced dissociation (HCD) fragmentation methods to boost interpretation of long, highly charged peptides and take advantage of corroborating b/y/c/z ions in CID/HCD/ETD. Using these strategies, we show that triplet CID/HCD/ETD MS/MS spectra from overlapping peptides yield de novo sequences of average length 70 AA and as long as 200 AA at up to 99% sequencing accuracy.
We report a mass spectrometry-based method for the integrated analysis of protein expression, phosphorylation, ubiquitination and acetylation by serial enrichments of different post-translational modifications (SEPTM) from the same biological sample. This technology enabled quantitative analysis of nearly 8,000 proteins and more than 20,000 phosphorylation, 15,000 ubiquitination and 3,000 acetylation sites per experiment, generating a holistic view of cellular signal transduction pathways as exemplified by analysis of bortezomib-treated human leukemia cells.
Labeling of primary amines on peptides with reagents containing stable isotopes is a commonly used technique in quantitative mass spectrometry. Isobaric labeling techniques such as iTRAQ™ or TMT™ allow for relative quantification of peptides based on ratios of reporter ions in the low m/z region of spectra produced by precursor ion fragmentation. In contrast, nonisobaric labeling with mTRAQ™ yields precursors with different masses that can be directly quantified in MS1 spectra. In this study, we compare iTRAQ- and mTRAQ-based quantification of peptides and phosphopeptides derived from EGF-stimulated HeLa cells. Both labels have identical chemical structures; therefore, precursor ion- and fragment ion-based quantification can be directly compared. Our results indicate that iTRAQ labeling has an additive effect on precursor intensities, whereas mTRAQ labeling leads to more redundant MS2 scanning events caused by triggering on the same peptide with different mTRAQ labels. We found that iTRAQ labeling quantified nearly threefold more phosphopeptides (12,129 versus 4,448) and nearly twofold more proteins (2,699 versus 1,597) than mTRAQ labeling. Although most key proteins in the EGFR signaling network were quantified with both techniques, iTRAQ labeling allowed quantification of twice as many kinases. Accuracy of reporter ion quantification by iTRAQ is adversely affected by peptides that are cofragmented in the same precursor isolation window, dampening observed ratios toward unity. However, because of tighter overall iTRAQ ratio distributions, the percentage of statistically significantly regulated phosphopeptides and proteins detected by iTRAQ and mTRAQ was similar. We observed a linear correlation of logarithmic iTRAQ to mTRAQ ratios over two orders of magnitude, indicating a possibility to correct iTRAQ ratios by an average compression factor. Spike-in experiments using peptides of defined ratios in a background of nonregulated peptides show that iTRAQ quantification is less accurate but not as variable as mTRAQ quantification.
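The idea of correcting iTRAQ ratios by an average compression factor can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the factor is the least-squares slope of log2(iTRAQ ratio) against log2(mTRAQ ratio) for peptides quantified by both labels, with the fit forced through the origin so that a ratio of 1 stays 1. The function names `fit_compression_factor` and `correct_itraq_ratio` and the example ratios are hypothetical.

```python
import math


def fit_compression_factor(itraq_ratios, mtraq_ratios):
    """Slope of log2(iTRAQ) vs log2(mTRAQ), fitted through the origin.

    Assumption for illustration: a slope below 1 means iTRAQ ratios are
    compressed toward unity relative to the paired mTRAQ ratios.
    """
    xs = [math.log2(r) for r in mtraq_ratios]
    ys = [math.log2(r) for r in itraq_ratios]
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("mTRAQ ratios must not all equal 1")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def correct_itraq_ratio(ratio, factor):
    """Expand a compressed iTRAQ ratio by dividing its log2 value by the factor."""
    return 2 ** (math.log2(ratio) / factor)


# Made-up paired ratios in which iTRAQ log ratios are exactly half the mTRAQ ones.
mtraq = [0.25, 0.5, 2.0, 4.0]
itraq = [0.5, 2 ** -0.5, 2 ** 0.5, 2.0]

factor = fit_compression_factor(itraq, mtraq)
corrected = correct_itraq_ratio(2.0, factor)
```

With these made-up inputs the fitted factor is 0.5, so an observed iTRAQ ratio of 2 is expanded to about 4. A single global factor is only reasonable if the log-log relationship is linear over the range of interest, as the abstract reports for two orders of magnitude.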
The extracellular matrix (ECM) is a complex meshwork of cross-linked proteins providing both biophysical and biochemical cues that are important regulators of cell proliferation, survival, differentiation, and migration. We present here a proteomic strategy developed to characterize the in vivo ECM composition of normal tissues and tumors using enrichment of protein extracts for ECM components and subsequent analysis by mass spectrometry. In parallel, we have developed a bioinformatic approach to predict the in silico "matrisome" defined as the ensemble of ECM proteins and associated factors. We report the characterization of the extracellular matrices of murine lung and colon, each comprising more than 100 ECM proteins and each presenting a characteristic signature. Moreover, using human tumor xenografts in mice, we show that both tumor cells and stromal cells contribute to the production of the tumor matrix and that tumors of differing metastatic potential differ in both the tumor- and the stroma-derived ECM components. The strategy we describe and illustrate here can be broadly applied and, to facilitate application of these methods by others, we provide resources including laboratory protocols, inventories of ECM domains and proteins, and instructions for bioinformatically deriving the human and mouse matrisome.
Peptide identification via tandem mass spectrometry sequence database searching is a key method in the array of tools available to the proteomics researcher. The ability to rapidly and sensitively acquire tandem mass spectrometry data and perform peptide and protein identifications has become a commonly used proteomics analysis technique because of advances in both instrumentation and software. Although many different tandem mass spectrometry database search tools are currently available from both academic and commercial sources, these algorithms share similar core elements while maintaining distinctive features. This review revisits the mechanism of sequence database searching and discusses how various parameter settings impact the underlying search.
Memory formation is modulated by pre- and post-synaptic signaling events in neurons. The neuronal protein kinase Cyclin-Dependent Kinase 5 (Cdk5) phosphorylates a variety of synaptic substrates and is implicated in memory formation. It has also been shown to play a role in homeostatic regulation of synaptic plasticity in cultured neurons. Surprisingly, we found that Cdk5 loss of function in hippocampal circuits results in severe impairments in memory formation and retrieval. Moreover, Cdk5 loss of function in the hippocampus disrupts cAMP signaling due to an aberrant increase in phosphodiesterase (PDE) proteins. Dysregulation of cAMP is associated with defective CREB phosphorylation and disrupted composition of synaptic proteins in Cdk5-deficient mice. Rolipram, a PDE4 inhibitor that prevents cAMP depletion, restores synaptic plasticity and memory formation in Cdk5-deficient mice. Collectively, our results demonstrate a critical role for Cdk5 in the regulation of cAMP-mediated hippocampal functions essential for synaptic plasticity and memory formation.
We developed a pipeline to integrate the proteomic technologies used from the discovery to the verification stages of plasma biomarker identification and applied it to identify early biomarkers of cardiac injury from the blood of patients undergoing a therapeutic, planned myocardial infarction (PMI) for treatment of hypertrophic cardiomyopathy. Sampling of blood directly from patient hearts before, during and after controlled myocardial injury ensured enrichment for candidate biomarkers and allowed patients to serve as their own biological controls. LC-MS/MS analyses detected 121 highly differentially expressed proteins, including previously credentialed markers of cardiovascular disease and >100 novel candidate biomarkers for myocardial infarction (MI). Accurate inclusion mass screening (AIMS) qualified a subset of the candidates based on highly specific, targeted detection in peripheral plasma, including some markers unlikely to have been identified without this step. Analyses of peripheral plasma from controls and patients with PMI or spontaneous MI by quantitative multiple reaction monitoring mass spectrometry or immunoassays suggest that the candidate biomarkers may be specific to MI. This study demonstrates that modern proteomic technologies, when coherently integrated, can yield novel cardiovascular biomarkers meriting further evaluation in large, heterogeneous cohorts.
Deciphering the signaling networks that underlie normal and disease processes remains a major challenge. Here, we report the discovery of signaling components involved in the Toll-like receptor (TLR) response of immune dendritic cells (DCs), including a previously unknown pathway shared across mammalian antiviral responses. By combining transcriptional profiling, genetic and small-molecule perturbations, and phosphoproteomics, we uncover 35 signaling regulators, including 16 known regulators, involved in TLR signaling. In particular, we find that Polo-like kinases (Plk) 2 and 4 are essential components of antiviral pathways in vitro and in vivo and activate a signaling branch involving a dozen proteins, among which is Tnfaip2, a gene associated with autoimmune diseases but whose role was unknown. Our study illustrates the power of combining systematic measurements and perturbations to elucidate complex signaling circuits and discover potential therapeutic targets.
Selective capture of glycopolypeptides followed by release and analysis of the former glycosylation-site peptides has been shown to have promise for reducing the complexity of body fluids such as blood for biomarker discovery. In this work, a protocol based on capture of polypeptides containing an N-linked carbohydrate from human plasma using commercially available magnetic beads coupled with hydrazide chemistry was optimized and partially automated through the use of a KingFisher magnetic particle processor. Comparison of bead-based glycocapture at the protein level vs the peptide level revealed differences in the specificity, reproducibility, and absolute number of former glycosylation-site peptides detected. Evaluation of a range of capture and elution conditions led to an optimized protocol with a 24% intraday and 30% interday CV and a glycopeptide capture specificity of 99%. Depleting the plasma of 14 high abundance proteins improved detection sensitivity by approximately 1 order of magnitude compared to nondepleted plasma and resulted in an increase of 24% in the number of identified glycoproteins. The sensitivity of SPEG for detection of glycoproteins in depleted, non-fractionated plasma was found to be in the 10-100 pmol/mL range, corresponding to glycoprotein levels ranging from 100s of nanograms/mL to 10s of micrograms/mL. Despite high capture specificity, the total number of glycoproteins detected and the sensitivity of SPEG in plasma are surprisingly limited.
Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize pre-analytical and analytical variation in comparative proteomics experiments.
A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 system performance metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than 10% and thus can reveal even subtle differences in performance of system components. Analyses of data from interlaboratory studies conducted under a common standard operating procedure identified outlier data and provided clues to specific causes. Moreover, interlaboratory variation reflected by the metrics indicates which system components vary the most between laboratories. Application of these metrics enables rational, quantitative quality assessment for proteomics and other LC-MS/MS analytical applications.
The aberrant activation of tyrosine kinases represents an important oncogenic mechanism, and yet the majority of such events remain undiscovered. Here we describe a bead-based method for detecting phosphorylation of both wild-type and mutant tyrosine kinases in a multiplexed, high-throughput and low-cost manner. With the aim of establishing a tyrosine kinase-activation catalog, we used this method to profile 130 human cancer lines. Follow-up experiments on the finding that SRC is frequently phosphorylated in glioblastoma cell lines showed that SRC is also activated in primary glioblastoma patient samples and that the SRC inhibitor dasatinib (Sprycel) inhibits viability and cell migration in vitro and tumor growth in vivo. Testing of dasatinib-resistant tyrosine kinase alleles confirmed that SRC is indeed the relevant target of dasatinib, which inhibits many tyrosine kinases. These studies establish the feasibility of tyrosine kinome-wide phosphorylation profiling and point to SRC as a possible therapeutic target in glioblastoma.
Animal cells initiate cytokinesis in parallel with anaphase onset, when an actomyosin ring assembles and constricts through localized activation of the small GTPase RhoA, giving rise to a cleavage furrow. Furrow formation relies on positional cues provided by anaphase spindle microtubules (MTs), but how such cues are generated remains unclear. Using chemical genetics to achieve both temporal and spatial control, we show that the self-organized delivery of Polo-like kinase 1 (Plk1) to the midzone and its local phosphorylation of a MT-bound substrate are critical for generating this furrow-inducing signal. When Plk1 was active but unable to target itself to this equatorial landmark, both cortical RhoA recruitment and furrow induction failed to occur, thus recapitulating the effects of anaphase-specific Plk1 inhibition. Using tandem mass spectrometry and phosphospecific antibodies, we found that Plk1 binds and directly phosphorylates the HsCYK-4 subunit of centralspindlin (also known as MgcRacGAP) at the midzone. At serine 157, this modification creates a major docking site for the tandem BRCT repeats of the Rho GTP exchange factor Ect2. Cells expressing only a nonphosphorylatable form of HsCYK-4 failed to localize Ect2 at the midzone and were severely impaired in cleavage furrow formation, implying that HsCYK-4 is Plk1's rate-limiting target upstream of RhoA. Conversely, tethering an inhibitor-resistant allele of Plk1 to HsCYK-4 allowed furrows to form despite global inhibition of all other Plk1 molecules in the cell. Our findings illuminate two key mechanisms governing the initiation of cytokinesis in human cells and illustrate the power of chemical genetics to probe such regulation both in time and space.
Cell-based screening can facilitate the rapid identification of compounds inducing complex cellular phenotypes. Advancing a compound toward the clinic, however, generally requires the identification of precise mechanisms of action. We previously found that epidermal growth factor receptor (EGFR) inhibitors induce acute myeloid leukemia (AML) differentiation via a non-EGFR mechanism. In this report, we integrated proteomic and RNAi-based strategies to identify their off-target, anti-AML mechanism. These orthogonal approaches identified Syk as a target in AML. Genetic and pharmacological inactivation of Syk with a drug in clinical trial for other indications promoted differentiation of AML cells and attenuated leukemia growth in vivo. These results demonstrate the power of integrating diverse chemical, proteomic, and genomic screening approaches to identify therapeutic strategies for cancer.
Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.
Ubiquitination plays a key role in protein degradation and signal transduction. Ubiquitin is a small protein modifier that is adducted to lysine residues by the combined function of E1, E2, and E3 enzymes and is removed by deubiquitinating enzymes. Characterization of ubiquitination sites is important for understanding the role of this modification in cellular processes and disease. However, until recently, large-scale characterization of endogenous ubiquitination sites has been hampered by the lack of efficient enrichment techniques. The introduction of antibodies that specifically recognize peptides with lysine residues that harbor a di-glycine remnant (K-ε-GG) following tryptic digestion has dramatically improved the ability to enrich and identify ubiquitination sites from cellular lysates. We used this enrichment technique to study the effects of proteasome inhibition by MG-132 and deubiquitinase inhibition by PR-619 on ubiquitination sites in human Jurkat cells by quantitative high performance mass spectrometry. Minimal fractionation of digested lysates prior to immunoaffinity enrichment increased the yield of K-ε-GG peptides three- to fourfold, resulting in detection of up to ~3300 distinct K-GG peptides in SILAC triple encoded experiments starting from 5 mg of protein per label state. In total, we identify 5533 distinct K-ε-GG peptides of which 4907 were quantified in this study, demonstrating that the strategy presented is a practical approach to perturbational studies in cell systems. We found that proteasome inhibition by MG-132 and deubiquitinase inhibition by PR-619 induce significant changes to the ubiquitin landscape, but that not all ubiquitination sites regulated by MG-132 and PR-619 are likely substrates for the ubiquitin-proteasome system. Additionally, we find that the proteasome and deubiquitinase inhibitors studied induced only minor changes in protein expression levels regardless of the extent of regulation induced at the ubiquitin site level. We attribute this finding to the low stoichiometry of the majority of ubiquitination sites identified in this study.
Using enrichment strategies, many research groups are routinely producing large data sets of post-translationally modified peptides for proteomic analysis using tandem mass spectrometry. Although search engines are relatively effective at identifying these peptides with a defined measure of reliability, their localization of the site or sites of modification is often arbitrary and unreliable. The field continues to be in need of a widely accepted metric for false localization rate that accurately describes the certainty of site localization in published data sets and allows for consistent measurement of differences in performance of emerging scoring algorithms. This article discusses the main strategies currently used by software for modification site localization and ways of assessing the performance of these different tools. Methods for representing ambiguity are reviewed, and a discussion of how the approaches transfer to different data types and modifications is presented.