The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
22 Related JoVE Articles!
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan , University of Kansas.
Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro
. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro
and in vivo
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
Diffusion Tensor Magnetic Resonance Imaging in the Analysis of Neurodegenerative Diseases
Institutions: University of Ulm.
Diffusion tensor imaging (DTI) techniques provide information on the microstructural processes of the cerebral white matter (WM) in vivo
. The present applications are designed to investigate differences of WM involvement patterns in different brain diseases, especially neurodegenerative disorders, by use of different DTI analyses in comparison with matched controls.
DTI data analysis is performed in a variate fashion, i.e.
voxelwise comparison of regional diffusion direction-based metrics such as fractional anisotropy (FA), together with fiber tracking (FT) accompanied by tractwise fractional anisotropy statistics (TFAS) at the group level in order to identify differences in FA along WM structures, aiming at the definition of regional patterns of WM alterations at the group level. Transformation into a stereotaxic standard space is a prerequisite for group studies and requires thorough data processing to preserve directional inter-dependencies. The present applications show optimized technical approaches for this preservation of quantitative and directional information during spatial normalization in data analyses at the group level. On this basis, FT techniques can be applied to group averaged data in order to quantify metrics information as defined by FT. Additionally, application of DTI methods, i.e.
differences in FA-maps after stereotaxic alignment, in a longitudinal analysis at an individual subject basis reveal information about the progression of neurological disorders. Further quality improvement of DTI based results can be obtained during preprocessing by application of a controlled elimination of gradient directions with high noise levels.
In summary, DTI is used to define a distinct WM pathoanatomy of different brain diseases by the combination of whole brain-based and tract-based DTI analysis.
Medicine, Issue 77, Neuroscience, Neurobiology, Molecular Biology, Biomedical Engineering, Anatomy, Physiology, Neurodegenerative Diseases, nuclear magnetic resonance, NMR, MR, MRI, diffusion tensor imaging, fiber tracking, group level comparison, neurodegenerative diseases, brain, imaging, clinical techniques
Metabolic Labeling and Membrane Fractionation for Comparative Proteomic Analysis of Arabidopsis thaliana Suspension Cell Cultures
Institutions: Max Plank Institute of Molecular Plant Physiology, University of Hohenheim.
Plasma membrane microdomains are features based on the physical properties of the lipid and sterol environment and have particular roles in signaling processes. Extracting sterol-enriched membrane microdomains from plant cells for proteomic analysis is a difficult task mainly due to multiple preparation steps and sources for contaminations from other cellular compartments. The plasma membrane constitutes only about 5-20% of all the membranes in a plant cell, and therefore isolation of highly purified plasma membrane fraction is challenging. A frequently used method involves aqueous two-phase partitioning in polyethylene glycol and dextran, which yields plasma membrane vesicles with a purity of 95% 1
. Sterol-rich membrane microdomains within the plasma membrane are insoluble upon treatment with cold nonionic detergents at alkaline pH. This detergent-resistant membrane fraction can be separated from the bulk plasma membrane by ultracentrifugation in a sucrose gradient 2
. Subsequently, proteins can be extracted from the low density band of the sucrose gradient by methanol/chloroform precipitation. Extracted protein will then be trypsin digested, desalted and finally analyzed by LC-MS/MS. Our extraction protocol for sterol-rich microdomains is optimized for the preparation of clean detergent-resistant membrane fractions from Arabidopsis thaliana
We use full metabolic labeling of Arabidopsis thaliana
suspension cell cultures with K15
as the only nitrogen source for quantitative comparative proteomic studies following biological treatment of interest 3
. By mixing equal ratios of labeled and unlabeled cell cultures for joint protein extraction the influence of preparation steps on final quantitative result is kept at a minimum. Also loss of material during extraction will affect both control and treatment samples in the same way, and therefore the ratio of light and heave peptide will remain constant. In the proposed method either labeled or unlabeled cell culture undergoes a biological treatment, while the other serves as control 4
Empty Value, Issue 79, Cellular Structures, Plants, Genetically Modified, Arabidopsis, Membrane Lipids, Intracellular Signaling Peptides and Proteins, Membrane Proteins, Isotope Labeling, Proteomics, plants, Arabidopsis thaliana, metabolic labeling, stable isotope labeling, suspension cell cultures, plasma membrane fractionation, two phase system, detergent resistant membranes (DRM), mass spectrometry, membrane microdomains, quantitative proteomics
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
A High Throughput MHC II Binding Assay for Quantitative Analysis of Peptide Epitopes
Institutions: Dartmouth College, University of Rhode Island, Dartmouth College.
Biochemical assays with recombinant human MHC II molecules can provide rapid, quantitative insights into immunogenic epitope identification, deletion, or design1,2
. Here, a peptide-MHC II binding assay is scaled to 384-well format. The scaled down protocol reduces reagent costs by 75% and is higher throughput than previously described 96-well protocols1,3-5
. Specifically, the experimental design permits robust and reproducible analysis of up to 15 peptides against one MHC II allele per 384-well ELISA plate. Using a single liquid handling robot, this method allows one researcher to analyze approximately ninety test peptides in triplicate over a range of eight concentrations and four MHC II allele types in less than 48 hr. Others working in the fields of protein deimmunization or vaccine design and development may find the protocol to be useful in facilitating their own work. In particular, the step-by-step instructions and the visual format of JoVE should allow other users to quickly and easily establish this methodology in their own labs.
Biochemistry, Issue 85, Immunoassay, Protein Immunogenicity, MHC II, T cell epitope, High Throughput Screen, Deimmunization, Vaccine Design
High-pressure Sapphire Cell for Phase Equilibria Measurements of CO2/Organic/Water Systems
Institutions: Georgia Institute of Technology, Georgia Institute of Technology.
The high pressure sapphire cell apparatus was constructed to visually determine the composition of multiphase systems without physical sampling. Specifically, the sapphire cell enables visual data collection from multiple loadings to solve a set of material balances to precisely determine phase composition. Ternary phase diagrams can then be established to determine the proportion of each component in each phase at a given condition. In principle, any ternary system can be studied although ternary systems (gas-liquid-liquid) are the specific examples discussed herein. For instance, the ternary THF-Water-CO2
system was studied at 25 and 40 °C and is described herein. Of key importance, this technique does not require sampling. Circumventing the possible disturbance of the system equilibrium upon sampling, inherent measurement errors, and technical difficulties of physically sampling under pressure is a significant benefit of this technique. Perhaps as important, the sapphire cell also enables the direct visual observation of the phase behavior. In fact, as the CO2
pressure is increased, the homogeneous THF-Water solution phase splits at about 2 MPa. With this technique, it was possible to easily and clearly observe the cloud point and determine the composition of the newly formed phases as a function of pressure.
The data acquired with the sapphire cell technique can be used for many applications. In our case, we measured swelling and composition for tunable solvents, like gas-expanded liquids, gas-expanded ionic liquids and Organic Aqueous Tunable Systems (OATS)1-4
. For the latest system, OATS, the high-pressure sapphire cell enabled the study of (1) phase behavior as a function of pressure and temperature, (2) composition of each phase (gas-liquid-liquid) as a function of pressure and temperature and (3) catalyst partitioning in the two liquid phases as a function of pressure and composition. Finally, the sapphire cell is an especially effective tool to gather accurate and reproducible measurements in a timely fashion.
Chemistry, Issue 83, Phase equilibria, High-pressure, green chemistry, Green Engineering, Sapphire cell, cathetometer, ternary phase diagrams
Multi-step Preparation Technique to Recover Multiple Metabolite Compound Classes for In-depth and Informative Metabolomic Analysis
Institutions: National Jewish Health, University of Colorado Denver.
Metabolomics is an emerging field which enables profiling of samples from living organisms in order to obtain insight into biological processes. A vital aspect of metabolomics is sample preparation whereby inconsistent techniques generate unreliable results. This technique encompasses protein precipitation, liquid-liquid extraction, and solid-phase extraction as a means of fractionating metabolites into four distinct classes. Improved enrichment of low abundance molecules with a resulting increase in sensitivity is obtained, and ultimately results in more confident identification of molecules. This technique has been applied to plasma, bronchoalveolar lavage fluid, and cerebrospinal fluid samples with volumes as low as 50 µl. Samples can be used for multiple downstream applications; for example, the pellet resulting from protein precipitation can be stored for later analysis. The supernatant from that step undergoes liquid-liquid extraction using water and strong organic solvent to separate the hydrophilic and hydrophobic compounds. Once fractionated, the hydrophilic layer can be processed for later analysis or discarded if not needed. The hydrophobic fraction is further treated with a series of solvents during three solid-phase extraction steps to separate it into fatty acids, neutral lipids, and phospholipids. This allows the technician the flexibility to choose which class of compounds is preferred for analysis. It also aids in more reliable metabolite identification since some knowledge of chemical class exists.
Bioengineering, Issue 89, plasma, chemistry techniques, analytical, solid phase extraction, mass spectrometry, metabolomics, fluids and secretions, profiling, small molecules, lipids, liquid chromatography, liquid-liquid extraction, cerebrospinal fluid, bronchoalveolar lavage fluid
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1
. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2
, because the composition and spatial configuration of head tissues changes dramatically over development3
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
Identifying Protein-protein Interaction Sites Using Peptide Arrays
Institutions: The Hebrew University of Jerusalem.
Protein-protein interactions mediate most of the processes in the living cell and control homeostasis of the organism. Impaired protein interactions may result in disease, making protein interactions important drug targets. It is thus highly important to understand these interactions at the molecular level. Protein interactions are studied using a variety of techniques ranging from cellular and biochemical assays to quantitative biophysical assays, and these may be performed either with full-length proteins, with protein domains or with peptides. Peptides serve as excellent tools to study protein interactions since peptides can be easily synthesized and allow the focusing on specific interaction sites. Peptide arrays enable the identification of the interaction sites between two proteins as well as screening for peptides that bind the target protein for therapeutic purposes. They also allow high throughput SAR studies. For identification of binding sites, a typical peptide array usually contains partly overlapping 10-20 residues peptides derived from the full sequences of one or more partner proteins of the desired target protein. Screening the array for binding the target protein reveals the binding peptides, corresponding to the binding sites in the partner proteins, in an easy and fast method using only small amount of protein.
In this article we describe a protocol for screening peptide arrays for mapping the interaction sites between a target protein and its partners. The peptide array is designed based on the sequences of the partner proteins taking into account their secondary structures. The arrays used in this protocol were Celluspots arrays prepared by INTAVIS Bioanalytical Instruments. The array is blocked to prevent unspecific binding and then incubated with the studied protein. Detection using an antibody reveals the binding peptides corresponding to the specific interaction sites between the proteins.
Molecular Biology, Issue 93, peptides, peptide arrays, protein-protein interactions, binding sites, peptide synthesis, micro-arrays
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico
RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
Perceptual and Category Processing of the Uncanny Valley Hypothesis' Dimension of Human Likeness: Some Methodological Issues
Institutions: University of Zurich.
Mori's Uncanny Valley Hypothesis1,2
proposes that the perception of humanlike characters such as robots and, by extension, avatars (computer-generated characters) can evoke negative or positive affect (valence) depending on the object's degree of visual and behavioral realism along a dimension of human likeness
) (Figure 1
). But studies of affective valence of subjective responses to variously realistic non-human characters have produced inconsistent findings 3, 4, 5, 6
. One of a number of reasons for this is that human likeness is not perceived as the hypothesis assumes. While the DHL can be defined following Mori's description as a smooth linear change in the degree of physical humanlike similarity, subjective perception of objects along the DHL can be understood in terms of the psychological effects of categorical perception (CP) 7
. Further behavioral and neuroimaging investigations of category processing and CP along the DHL and of the potential influence of the dimension's underlying category structure on affective experience are needed. This protocol therefore focuses on the DHL and allows examination of CP. Based on the protocol presented in the video as an example, issues surrounding the methodology in the protocol and the use in "uncanny" research of stimuli drawn from morph continua to represent the DHL are discussed in the article that accompanies the video. The use of neuroimaging and morph stimuli to represent the DHL in order to disentangle brain regions neurally responsive to physical human-like similarity from those responsive to category change and category processing is briefly illustrated.
Behavior, Issue 76, Neuroscience, Neurobiology, Molecular Biology, Psychology, Neuropsychology, uncanny valley, functional magnetic resonance imaging, fMRI, categorical perception, virtual reality, avatar, human likeness, Mori, uncanny valley hypothesis, perception, magnetic resonance imaging, MRI, imaging, clinical techniques
MISSION esiRNA for RNAi Screening in Mammalian Cells
Institutions: Max Planck Institute of Molecular Cell Biology and Genetics.
RNA interference (RNAi) is a basic cellular mechanism for the control of gene expression. RNAi is induced by short double-stranded RNAs also known as small interfering RNAs (siRNAs). The short double-stranded RNAs originate from longer double stranded precursors by the activity of Dicer, a protein of the RNase III family of endonucleases. The resulting fragments are components of the RNA-induced silencing complex (RISC), directing it to the cognate target mRNA. RISC cleaves the target mRNA thereby reducing the expression of the encoded protein1,2,3
. RNAi has become a powerful and widely used experimental method for loss of gene function studies in mammalian cells utilizing small interfering RNAs.
Currently two main methods are available for the production of small interfering RNAs. One method involves chemical synthesis, whereas an alternative method employs endonucleolytic cleavage of target specific long double-stranded RNAs by RNase III in vitro
. Thereby, a diverse pool of siRNA-like oligonucleotides is produced which is also known as endoribonuclease-prepared siRNA or esiRNA. A comparison of efficacy of chemically derived siRNAs and esiRNAs shows that both triggers are potent in target-gene silencing. Differences can, however, be seen when comparing specificity. Many single chemically synthesized siRNAs produce prominent off-target effects, whereas the complex mixture inherent in esiRNAs leads to a more specific knockdown10
In this study, we present the design of genome-scale MISSION esiRNA libraries and its utilization for RNAi screening exemplified by a DNA-content screen for the identification of genes involved in cell cycle progression. We show how to optimize the transfection protocol and the assay for screening in high throughput. We also demonstrate how large data-sets can be evaluated statistically and present methods to validate primary hits. Finally, we give potential starting points for further functional characterizations of validated hits.
Cellular Biology, Issue 39, MISSION, esiRNA, RNAi, cell cycle, high throughput screening
Modified Annexin V/Propidium Iodide Apoptosis Assay For Accurate Assessment of Cell Death
Institutions: University of Alberta, University of Alberta.
Studies of cellular apoptosis have been significantly impacted since the introduction of flow cytometry-based methods. Propidium iodide (PI) is widely used in conjunction with Annexin V to determine if cells are viable, apoptotic, or necrotic through differences in plasma membrane integrity and permeability1,2
. The Annexin V/ PI protocol is a commonly used approach for studying apoptotic cells3
. PI is used more often than other nuclear stains because it is economical, stable and a good indicator of cell viability, based on its capacity to exclude dye in living cells 4,5
. The ability of PI to enter a cell is dependent upon the permeability of the membrane; PI does not stain live or early apoptotic cells due to the presence of an intact plasma membrane 1,2,6
. In late apoptotic and necrotic cells, the integrity of the plasma and nuclear membranes decreases7,8
, allowing PI to pass through the membranes, intercalate into nucleic acids, and display red fluorescence 1,2,9
. Unfortunately, we find that conventional Annexin V/ PI protocols lead to a significant number of false positive events (up to 40%), which are associated with PI staining of RNA within the cytoplasmic compartment10
. Primary cells and cell lines in a broad range of animal models are affected, with large cells (nuclear: cytoplasmic ratios <0.5) showing the highest occurrence10
. Herein, we demonstrate a modified Annexin V/ PI method that provides a significant improvement for assessment of cell death compared to conventional methods. This protocol takes advantage of changes in cellular permeability during cell fixing to promote entry of RNase A into cells following staining. Both the timing and concentration of RNase A have been optimized for removal of cytoplasmic RNA. The result is a significant improvement over conventional Annexin V/ PI protocols (< 5% events with cytoplasmic PI staining).
Cellular Biology, Issue 50, Apoptosis, cell death, propidium iodide, Annexin V, necrosis, immunology
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1
). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information on populations generated by both forward and reverse genetics approaches in spite of none of these are effortless. By corollary this approach can also be used as an approach to characterise unknown peaks representing new or specific secondary metabolites in the limited tissues, plant species or stress treatment, which is currently the important trial to understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Modified Yeast-Two-Hybrid System to Identify Proteins Interacting with the Growth Factor Progranulin
Institutions: NYU Hospital for Joint Diseases, New York University School of Medicine.
Progranulin (PGRN), also known as granulin epithelin precursor (GEP), is a 593-amino-acid autocrine growth factor. PGRN is known to play a critical role in a variety of physiologic and disease processes, including early embryogenesis, wound healing 1
, inflammation 2, 3
, and host defense 4
. PGRN also functions as a neurotrophic factor 5
, and mutations in the PGRN gene resulting in partial loss of the PGRN protein cause frontotemporal dementia 6, 7
. Our recent studies have led to the isolation of PGRN as an important regulator of cartilage development and degradation 8-11
. Although PGRN, discovered nearly two decades ago, plays crucial roles in multiple physiological and pathological conditions, efforts to exploit the actions of PGRN and understand the mechanisms involved have been significantly hampered by our inability to identify its binding receptor(s). To address this issue, we developed a modified yeast two-hybrid (MY2H) approach based on the most commonly used GAL4 based 2-hybrid system. Compared with the conventional yeast two-hybrid screen, MY2H dramatically shortens the screen process and reduces the number of false positive clones. In addition, this approach is reproducible and reliable, and we have successfully employed this system in isolating the binding proteins of various baits, including ion channel 12
, extracellular matrix protein 10, 13
, and growth factor14
. In this paper, we describe this MY2H experimental procedure in detail using PGRN as an example that led to the identification of TNFR2 as the first known PGRN-associated receptor 14, 15
Molecular Biology, Issue 59, Modified yeast two-hybrid screen, PGRN, TNFR2, inflammation, autoimmune diseases
Avidity-based Extracellular Interaction Screening (AVEXIS) for the Scalable Detection of Low-affinity Extracellular Receptor-Ligand Interactions
Institutions: Wellcome Trust Sanger Institute.
Extracellular protein:protein interactions between secreted or membrane-tethered proteins are critical for both initiating intercellular communication and ensuring cohesion within multicellular organisms. Proteins predicted to form extracellular interactions are encoded by approximately a quarter of human genes1
, but despite their importance and abundance, the majority of these proteins have no documented binding partner. Primarily, this is due to their biochemical intractability: membrane-embedded proteins are difficult to solubilise in their native conformation and contain structurally-important posttranslational modifications. Also, the interaction affinities between receptor proteins are often characterised by extremely low interaction strengths (half-lives < 1 second) precluding their detection with many commonly-used high throughput methods2
Here, we describe an assay, AVEXIS (AVidity-based EXtracellular Interaction Screen) that overcomes these technical challenges enabling the detection of very weak protein interactions (t1/2
≤ 0.1 sec) with a low false positive rate3
. The assay is usually implemented in a high throughput format to enable the systematic screening of many thousands of interactions in a convenient microtitre plate format (Fig. 1). It relies on the production of soluble recombinant protein libraries that contain the ectodomain fragments of cell surface receptors or secreted proteins within which to screen for interactions; therefore, this approach is suitable for type I, type II, GPI-linked cell surface receptors and secreted proteins but not for multipass membrane proteins such as ion channels or transporters.
The recombinant protein libraries are produced using a convenient and high-level mammalian expression system4
, to ensure that important posttranslational modifications such as glycosylation and disulphide bonds are added. Expressed recombinant proteins are secreted into the medium and produced in two forms: a biotinylated bait which can be captured on a streptavidin-coated solid phase suitable for screening, and a pentamerised enzyme-tagged (β-lactamase) prey. The bait and prey proteins are presented to each other in a binary fashion to detect direct interactions between them, similar to a conventional ELISA (Fig. 1). The pentamerisation of the proteins in the prey is achieved through a peptide sequence from the cartilage oligomeric matrix protein (COMP) and increases the local concentration of the ectodomains thereby providing significant avidity gains to enable even very transient interactions to be detected. By normalising the activities of both the bait and prey to predetermined levels prior to screening, we have shown that interactions having monomeric half-lives of 0.1 sec can be detected with low false positive rates3
Molecular Biology, Issue 61, Receptor-ligand pairs, Extracellular protein interactions, AVEXIS, Adhesion receptors, Transient/weak interactions, High throughput screening
Polymerase Chain Reaction: Basic Protocol Plus Troubleshooting and Optimization Strategies
Institutions: University of California, Los Angeles .
In the biological sciences there have been technological advances that catapult the discipline into golden ages of discovery. For example, the field of microbiology was transformed with the advent of Anton van Leeuwenhoek's microscope, which allowed scientists to visualize prokaryotes for the first time. The development of the polymerase chain reaction (PCR) is one of those innovations that changed the course of molecular science with its impact spanning countless subdisciplines in biology. The theoretical process was outlined by Keppe and coworkers in 1971; however, it was another 14 years until the complete PCR procedure was described and experimentally applied by Kary Mullis while at Cetus Corporation in 1985. Automation and refinement of this technique progressed with the introduction of a thermal stable DNA polymerase from the bacterium Thermus aquaticus
, consequently the name Taq
PCR is a powerful amplification technique that can generate an ample supply of a specific segment of DNA (i.e., an amplicon) from only a small amount of starting material (i.e., DNA template or target sequence). While straightforward and generally trouble-free, there are pitfalls that complicate the reaction producing spurious results. When PCR fails it can lead to many non-specific DNA products of varying sizes that appear as a ladder or smear of bands on agarose gels. Sometimes no products form at all. Another potential problem occurs when mutations are unintentionally introduced in the amplicons, resulting in a heterogeneous population of PCR products. PCR failures can become frustrating unless patience and careful troubleshooting are employed to sort out and solve the problem(s). This protocol outlines the basic principles of PCR, provides a methodology that will result in amplification of most target sequences, and presents strategies for optimizing a reaction. By following this PCR guide, students should be able to:
● Set up reactions and thermal cycling conditions for a conventional PCR experiment
● Understand the function of various reaction components and their overall effect on a PCR experiment
● Design and optimize a PCR experiment for any DNA template
● Troubleshoot failed PCR experiments
Basic Protocols, Issue 63, PCR, optimization, primer design, melting temperature, Tm, troubleshooting, additives, enhancers, template DNA quantification, thermal cycler, molecular biology, genetics
Optical Recording of Suprathreshold Neural Activity with Single-cell and Single-spike Resolution
Institutions: The University of Texas at Austin.
Signaling of information in the vertebrate central nervous system is often carried by populations of neurons rather than individual neurons. Also propagation of suprathreshold spiking activity involves populations of neurons. Empirical studies addressing cortical function directly thus require recordings from populations of neurons with high resolution. Here we describe an optical method and a deconvolution algorithm to record neural activity from up to 100 neurons with single-cell and single-spike resolution. This method relies on detection of the transient increases in intracellular somatic calcium concentration associated with suprathreshold electrical spikes (action potentials) in cortical neurons. High temporal resolution of the optical recordings is achieved by a fast random-access scanning technique using acousto-optical deflectors (AODs)1
. Two-photon excitation of the calcium-sensitive dye results in high spatial resolution in opaque brain tissue2
. Reconstruction of spikes from the fluorescence calcium recordings is achieved by a maximum-likelihood method. Simultaneous electrophysiological and optical recordings indicate that our method reliably detects spikes (>97% spike detection efficiency), has a low rate of false positive spike detection (< 0.003 spikes/sec), and a high temporal precision (about 3 msec) 3
. This optical method of spike detection can be used to record neural activity in vitro
and in anesthetized animals in vivo3,4
Neuroscience, Issue 67, functional calcium imaging, spatiotemporal patterns of activity, dithered random-access scanning
Identification of Protein Complexes in Escherichia coli using Sequential Peptide Affinity Purification in Combination with Tandem Mass Spectrometry
Institutions: University of Toronto, University of Regina, University of Toronto.
Since most cellular processes are mediated by macromolecular assemblies, the systematic identification of protein-protein interactions (PPI) and the identification of the subunit composition of multi-protein complexes can provide insight into gene function and enhance understanding of biological systems1, 2
. Physical interactions can be mapped with high confidence vialarge-scale isolation and characterization of endogenous protein complexes under near-physiological conditions based on affinity purification of chromosomally-tagged proteins in combination with mass spectrometry (APMS). This approach has been successfully applied in evolutionarily diverse organisms, including yeast, flies, worms, mammalian cells, and bacteria1-6
. In particular, we have generated a carboxy-terminal Sequential Peptide Affinity (SPA) dual tagging system for affinity-purifying native protein complexes from cultured gram-negative Escherichia coli
, using genetically-tractable host laboratory strains that are well-suited for genome-wide investigations of the fundamental biology and conserved processes of prokaryotes1, 2, 7
. Our SPA-tagging system is analogous to the tandem affinity purification method developed originally for yeast8, 9
, and consists of a calmodulin binding peptide (CBP) followed by the cleavage site for the highly specific tobacco etch virus
(TEV) protease and three copies of the FLAG epitope (3X FLAG), allowing for two consecutive rounds of affinity enrichment. After cassette amplification, sequence-specific linear PCR products encoding the SPA-tag and a selectable marker are integrated and expressed in frame as carboxy-terminal fusions in a DY330 background that is induced to transiently express a highly efficient heterologous bacteriophage lambda recombination system10
. Subsequent dual-step purification using calmodulin and anti-FLAG affinity beads enables the highly selective and efficient recovery of even low abundance protein complexes from large-scale cultures. Tandem mass spectrometry is then used to identify the stably co-purifying proteins with high sensitivity (low nanogram detection limits).
Here, we describe detailed step-by-step procedures we commonly use for systematic protein tagging, purification and mass spectrometry-based analysis of soluble protein complexes from E. coli
, which can be scaled up and potentially tailored to other bacterial species, including certain opportunistic pathogens that are amenable to recombineering. The resulting physical interactions can often reveal interesting unexpected components and connections suggesting novel mechanistic links. Integration of the PPI data with alternate molecular association data such as genetic (gene-gene) interactions and genomic-context (GC) predictions can facilitate elucidation of the global molecular organization of multi-protein complexes within biological pathways. The networks generated for E. coli
can be used to gain insight into the functional architecture of orthologous gene products in other microbes for which functional annotations are currently lacking.
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, affinity purification, Escherichia coli, gram-negative bacteria, cytosolic proteins, SPA-tagging, homologous recombination, mass spectrometry, protein interaction, protein complex
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1
. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2
. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5
to more rigorous statistical models, e.g.
Hidden Markov Models (HMMs)6-8
. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9
, which utilizes only explicit formulas-an innovation crucial to its performance advantages. More sophisticated then heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity-evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11
and epigenetic data12
to illustrate its usefulness.
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
Interview: Protein Folding and Studies of Neurodegenerative Diseases
Institutions: MIT - Massachusetts Institute of Technology.
In this interview, Dr. Lindquist describes relationships between protein folding, prion diseases and neurodegenerative disorders. The problem of the protein folding is at the core of the modern biology. In addition to their traditional biochemical functions, proteins can mediate transfer of biological information and therefore can be considered a genetic material. This recently discovered function of proteins has important implications for studies of human disorders. Dr. Lindquist also describes current experimental approaches to investigate the mechanism of neurodegenerative diseases based on genetic studies in model organisms.
Neuroscience, issue 17, protein folding, brain, neuron, prion, neurodegenerative disease, yeast, screen, Translational Research