The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization-based sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
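The sequence selection stage described above can be illustrated with a toy sketch. This is not Protein WISDOM's actual energy model or optimizer: the per-position energy table, the three-letter alphabet, and the function names are all hypothetical, chosen only to show the idea of ranking candidate sequences by a potential energy over the sequence space.

```python
from itertools import product

# Hypothetical per-position energies in arbitrary units.
# These numbers are illustrative only and are NOT taken from Protein WISDOM.
POSITION_ENERGY = [
    {"A": -1.0, "L": -2.0, "K": 0.5},
    {"A": -0.7, "L": 0.3, "K": -1.5},
    {"A": 0.2, "L": -1.2, "K": -0.1},
]
ALPHABET = "ALK"  # reduced alphabet, assumed for brevity


def sequence_energy(seq):
    """Sum the toy energy contribution of each residue at its position."""
    return sum(POSITION_ENERGY[i][aa] for i, aa in enumerate(seq))


def rank_sequences(top_n=3):
    """Enumerate every sequence and return the top_n with the lowest energy."""
    candidates = ("".join(p) for p in product(ALPHABET, repeat=len(POSITION_ENERGY)))
    return sorted(candidates, key=sequence_energy)[:top_n]


ranked = rank_sequences()
for seq in ranked:
    print(seq, round(sequence_energy(seq), 2))
```

Exhaustive enumeration only works for tiny examples like this one; a real design problem needs an optimization method because the sequence space grows exponentially with the number of design positions. The fold specificity and binding affinity stages would then re-score the shortlisted sequences.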
25 Related JoVE Articles!
Sequence-specific Labeling of Nucleic Acids and Proteins with Methyltransferases and Cofactor Analogues
Institutions: RWTH Aachen University.
S-Adenosyl-l-methionine (AdoMet or SAM)-dependent methyltransferases (MTase) catalyze the transfer of the activated methyl group from AdoMet to specific positions in DNA, RNA, proteins and small biomolecules. This natural methylation reaction can be expanded to a wide variety of alkylation reactions using synthetic cofactor analogues. Replacement of the reactive sulfonium center of AdoMet with an aziridine ring leads to cofactors which can be coupled with DNA by various DNA MTases. These aziridine cofactors can be equipped with reporter groups at different positions of the adenine moiety and used for Sequence-specific Methyltransferase-Induced Labeling of DNA (SMILing DNA). As a typical example we give a protocol for biotinylation of pBR322 plasmid DNA at the 5’-ATCGAT-3’ sequence with the DNA MTase M.BseCI and the aziridine cofactor 6BAz in one step. Extension of the activated methyl group with unsaturated alkyl groups results in another class of AdoMet analogues which are used for methyltransferase-directed Transfer of Activated Groups (mTAG). Since the extended side chains are activated by the sulfonium center and the unsaturated bond, these cofactors are called double-activated AdoMet analogues. These analogues not only function as cofactors for DNA MTases, like the aziridine cofactors, but also for RNA, protein and small molecule MTases. They are typically used for enzymatic modification of MTase substrates with unique functional groups which are labeled with reporter groups in a second chemical step. This is exemplified in a protocol for fluorescence labeling of histone H3 protein. A small propargyl group is transferred from the cofactor analogue SeAdoYn to the protein by the histone H3 lysine 4 (H3K4) MTase Set7/9 followed by click labeling of the alkynylated histone H3 with TAMRA azide. MTase-mediated labeling with cofactor analogues is an enabling technology for many exciting applications including identification and functional study of MTase substrates as well as DNA genotyping and methylation detection.
Biochemistry, Issue 93, S-adenosyl-l-methionine, AdoMet, SAM, aziridine cofactor, double activated cofactor, methyltransferase, DNA methylation, protein methylation, biotin labeling, fluorescence labeling, SMILing, mTAG
The ChroP Approach Combines ChIP and Mass Spectrometry to Dissect Locus-specific Proteomic Landscapes of Chromatin
Institutions: European Institute of Oncology.
Chromatin is a highly dynamic nucleoprotein complex made of DNA and proteins that controls various DNA-dependent processes. Chromatin structure and function at specific regions is regulated by the local enrichment of histone post-translational modifications (hPTMs) and variants, chromatin-binding proteins, including transcription factors, and DNA methylation. The proteomic characterization of chromatin composition at distinct functional regions has been so far hampered by the lack of efficient protocols to enrich such domains at the appropriate purity and amount for the subsequent in-depth analysis by Mass Spectrometry (MS). We describe here a newly designed chromatin proteomics strategy, named ChroP (Chromatin Proteomics), whereby a preparative chromatin immunoprecipitation is used to isolate distinct chromatin regions whose features, in terms of hPTMs, variants and co-associated non-histonic proteins, are analyzed by MS. We illustrate here the setting up of ChroP for the enrichment and analysis of transcriptionally silent heterochromatic regions, marked by the presence of tri-methylation of lysine 9 on histone H3. The results achieved demonstrate the potential of ChroP in thoroughly characterizing the heterochromatin proteome and prove it as a powerful analytical strategy for understanding how the distinct protein determinants of chromatin interact and synergize to establish locus-specific structural and functional configurations.
Biochemistry, Issue 86, chromatin, histone post-translational modifications (hPTMs), epigenetics, mass spectrometry, proteomics, SILAC, chromatin immunoprecipitation, histone variants, chromatome, hPTMs cross-talks
Visualization of ATP Synthase Dimers in Mitochondria by Electron Cryo-tomography
Institutions: Max Planck Institute of Biophysics.
Electron cryo-tomography is a powerful tool in structural biology, capable of visualizing the three-dimensional structure of biological samples, such as cells, organelles, membrane vesicles, or viruses at molecular detail. To achieve this, the aqueous sample is rapidly vitrified in liquid ethane, which preserves it in a close-to-native, frozen-hydrated state. In the electron microscope, tilt series are recorded at liquid nitrogen temperature, from which 3D tomograms are reconstructed. The signal-to-noise ratio of the tomographic volume is inherently low. Recognizable, recurring features are enhanced by subtomogram averaging, by which individual subvolumes are cut out, aligned and averaged to reduce noise. In this way, 3D maps with a resolution of 2 nm or better can be obtained. A fit of available high-resolution structures to the 3D volume then produces atomic models of protein complexes in their native environment. Here we show how we use electron cryo-tomography to study the in situ organization of large membrane protein complexes in mitochondria. We find that ATP synthases are organized in rows of dimers along highly curved apices of the inner membrane cristae, whereas complex I is randomly distributed in the membrane regions on either side of the rows. By subtomogram averaging we obtained a structure of the mitochondrial ATP synthase dimer within the cristae membrane.
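The noise-reduction effect of subtomogram averaging can be sketched numerically. The example below is a simplified illustration, not the authors' processing pipeline: it assumes the subvolumes are already cut out and perfectly aligned, represents each one as a short 1D list of voxel values, and uses a made-up sine-wave signal with Gaussian noise. Under those assumptions, averaging N copies should reduce the residual noise by roughly a factor of the square root of N.

```python
import math
import random


def rms(values):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def average_subvolumes(subvolumes):
    """Voxel-wise mean over subvolumes that are assumed to be pre-aligned."""
    n = len(subvolumes)
    return [sum(voxels) / n for voxels in zip(*subvolumes)]


rng = random.Random(0)  # fixed seed so the run is repeatable
signal = [math.sin(2 * math.pi * i / 16) for i in range(64)]  # hypothetical true density
noise_sigma = 1.0  # assumed noise level, comparable to the signal amplitude
n_copies = 100     # assumed number of extracted subvolumes

# Each subvolume is the same underlying signal plus independent noise.
subvolumes = [
    [s + rng.gauss(0.0, noise_sigma) for s in signal] for _ in range(n_copies)
]
averaged = average_subvolumes(subvolumes)

# Residual noise before and after averaging, measured against the known signal.
noise_single = rms([a - s for a, s in zip(subvolumes[0], signal)])
noise_averaged = rms([a - s for a, s in zip(averaged, signal)])
print(f"single subvolume noise: {noise_single:.3f}")
print(f"averaged noise:         {noise_averaged:.3f}")
```

In real data the alignment step dominates the effort, since each subvolume must first be rotated and shifted into a common frame before the average is meaningful.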
Structural Biology, Issue 91, electron microscopy, electron cryo-tomography, mitochondria, ultrastructure, membrane structure, membrane protein complexes, ATP synthase, energy conversion, bioenergetics
Protein-protein Interactions Visualized by Bimolecular Fluorescence Complementation in Tobacco Protoplasts and Leaves
Institutions: Ludwig-Maximilians-Universität, München.
Many proteins interact transiently with other proteins or are integrated into multi-protein complexes to perform their biological function. Bimolecular fluorescence complementation (BiFC) is an in vivo method to monitor such interactions in plant cells. In the presented protocol the investigated candidate proteins are fused to complementary halves of fluorescent proteins and the respective constructs are introduced into plant cells via Agrobacterium-mediated transformation. Subsequently, the proteins are transiently expressed in tobacco leaves and the restored fluorescent signals can be detected with a confocal laser scanning microscope in the intact cells. This allows not only visualization of the interaction itself but also determination of the subcellular localization of the protein complexes. For this purpose, marker genes containing a fluorescent tag can be coexpressed along with the BiFC constructs, thus visualizing cellular structures such as the endoplasmic reticulum, mitochondria, the Golgi apparatus or the plasma membrane. The fluorescent signal can be monitored either directly in epidermal leaf cells or in single protoplasts, which can be easily isolated from the transformed tobacco leaves. BiFC is ideally suited to study protein-protein interactions in their natural surroundings within the living cell. However, it has to be considered that the expression has to be driven by strong promoters and that the interaction partners are modified due to fusion of the relatively large fluorescence tags, which might interfere with the interaction mechanism. Nevertheless, BiFC is an excellent complementary approach to other commonly applied methods investigating protein-protein interactions, such as coimmunoprecipitation, in vitro pull-down assays or yeast-two-hybrid experiments.
Plant Biology, Issue 85, Tetratricopeptide repeat domain, chaperone, chloroplasts, endoplasmic reticulum, HSP90, Toc complex, Sec translocon, BiFC
High Throughput Quantitative Expression Screening and Purification Applied to Recombinant Disulfide-rich Venom Proteins Produced in E. coli
Institutions: Aix-Marseille Université, Commissariat à l'énergie atomique et aux énergies alternatives (CEA) Saclay, France.
Escherichia coli (E. coli) is the most widely used expression system for the production of recombinant proteins for structural and functional studies. However, purifying proteins is sometimes challenging since many proteins are expressed in an insoluble form. When working with difficult or multiple targets it is therefore recommended to use high throughput (HTP) protein expression screening on a small scale (1-4 ml cultures) to quickly identify conditions for soluble expression. To cope with the various structural genomics programs of the lab, a quantitative (within a range of 0.1-100 mg/L culture of recombinant protein) and HTP protein expression screening protocol was implemented and validated on thousands of proteins. The protocols were automated with the use of a liquid handling robot but can also be performed manually without specialized equipment.
Disulfide-rich venom proteins are gaining increasing recognition for their potential as therapeutic drug leads. They can be highly potent and selective, but their complex disulfide bond networks make them challenging to produce. As a member of the FP7 European Venomics project (www.venomics.eu), our challenge is to develop successful production strategies with the aim of producing thousands of novel venom proteins for functional characterization. Aided by the redox properties of disulfide bond isomerase DsbC, we adapted our HTP production pipeline for the expression of oxidized, functional venom peptides in the E. coli cytoplasm. The protocols are also applicable to the production of diverse disulfide-rich proteins. Here we demonstrate our pipeline applied to the production of animal venom proteins. With the protocols described herein it is likely that soluble disulfide-rich proteins will be obtained in as little as a week. Even from a small scale, there is the potential to use the purified proteins for validating the oxidation state by mass spectrometry, for characterization in pilot studies, or for sensitive micro-assays.
Bioengineering, Issue 89, E. coli, expression, recombinant, high throughput (HTP), purification, auto-induction, immobilized metal affinity chromatography (IMAC), tobacco etch virus protease (TEV) cleavage, disulfide bond isomerase C (DsbC) fusion, disulfide bonds, animal venom proteins/peptides
A Restriction Enzyme Based Cloning Method to Assess the In vitro Replication Capacity of HIV-1 Subtype C Gag-MJ4 Chimeric Viruses
Institutions: Emory University.
The protective effect of many HLA class I alleles on HIV-1 pathogenesis and disease progression is, in part, attributed to their ability to target conserved portions of the HIV-1 genome that escape with difficulty. Sequence changes attributed to cellular immune pressure arise across the genome during infection, and if found within conserved regions of the genome such as Gag, can affect the ability of the virus to replicate in vitro. Transmission of HLA-linked polymorphisms in Gag to HLA-mismatched recipients has been associated with reduced set point viral loads. We hypothesized this may be due to a reduced replication capacity of the virus. Here we present a novel method for assessing the in vitro replication of HIV-1 as influenced by the gag gene isolated from acute time points from subtype C infected Zambians. This method uses restriction enzyme based cloning to insert the gag gene into a common subtype C HIV-1 proviral backbone, MJ4. This makes it more appropriate to the study of subtype C sequences than previous recombination based methods that have assessed the in vitro replication of chronically derived gag-pro sequences. Nevertheless, the protocol could be readily modified for studies of viruses from other subtypes. Moreover, this protocol details a robust and reproducible method for assessing the replication capacity of the Gag-MJ4 chimeric viruses on a CEM-based T cell line. This method was utilized for the study of Gag-MJ4 chimeric viruses derived from 149 subtype C acutely infected Zambians, and has allowed for the identification of residues in Gag that affect replication. More importantly, the implementation of this technique has facilitated a deeper understanding of how viral replication defines parameters of early HIV-1 pathogenesis such as set point viral load and longitudinal CD4+ T cell decline.
Infectious Diseases, Issue 90, HIV-1, Gag, viral replication, replication capacity, viral fitness, MJ4, CEM, GXR25
Nanomanipulation of Single RNA Molecules by Optical Tweezers
Institutions: University at Albany, State University of New York.
A large portion of the human genome is transcribed but not translated. In this post-genomic era, regulatory functions of RNA have been shown to be increasingly important. As RNA function often depends on its ability to adopt alternative structures, it is difficult to predict RNA three-dimensional structures directly from sequence. Single-molecule approaches show potential to solve the problem of RNA structural polymorphism by monitoring molecular structures one molecule at a time. This work presents a method to precisely manipulate the folding and structure of single RNA molecules using optical tweezers. First, methods to synthesize molecules suitable for single-molecule mechanical work are described. Next, various calibration procedures to ensure the proper operation of the optical tweezers are discussed. Various experiments are then explained. To demonstrate the utility of the technique, results of mechanically unfolding RNA hairpins and a single RNA kissing complex are used as evidence. In these examples, the nanomanipulation technique was used to study folding of each structural domain, including secondary and tertiary, independently. Lastly, the limitations and future applications of the method are discussed.
Bioengineering, Issue 90, RNA folding, single-molecule, optical tweezers, nanomanipulation, RNA secondary structure, RNA tertiary structure
Discovering Protein Interactions and Characterizing Protein Function Using HaloTag Technology
Institutions: Promega Corporation, MS Bioworks LLC.
Research in proteomics has exploded in recent years with advances in mass spectrometry capabilities that have led to the characterization of numerous proteomes, including those from viruses, bacteria, and yeast. In comparison, analysis of the human proteome lags behind, partially due to the sheer number of proteins which must be studied, but also the complexity of networks and interactions these present. To specifically address the challenges of understanding the human proteome, we have developed HaloTag technology for protein isolation, particularly strong for isolation of multiprotein complexes and allowing more efficient capture of weak or transient interactions and/or proteins in low abundance. HaloTag is a genetically encoded protein fusion tag, designed for covalent, specific, and rapid immobilization or labelling of proteins with various ligands. Leveraging these properties, numerous applications for mammalian cells were developed to characterize protein function and here we present methodologies including: protein pull-downs used for discovery of novel interactions or functional assays, and cellular localization. We find significant advantages in the speed, specificity, and covalent capture of fusion proteins to surfaces for proteomic analysis as compared to other traditional non-covalent approaches. We demonstrate these and the broad utility of the technology using two important epigenetic proteins as examples, the human bromodomain protein BRD4, and histone deacetylase HDAC1. These examples demonstrate the power of this technology in enabling the discovery of novel interactions and characterizing cellular localization in eukaryotes, which will together further understanding of human functional proteomics.
Cellular Biology, Issue 89, proteomics, HaloTag, protein interactions, mass spectrometry, bromodomain proteins, BRD4, histone deacetylase (HDAC), HDAC cellular assays, and confocal imaging
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g., signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Metabolomic Analysis of Rat Brain by High Resolution Nuclear Magnetic Resonance Spectroscopy of Tissue Extracts
Institutions: Aix-Marseille Université.
Studies of gene expression on the RNA and protein levels have long been used to explore biological processes underlying disease. More recently, genomics and proteomics have been complemented by comprehensive quantitative analysis of the metabolite pool present in biological systems. This strategy, termed metabolomics, strives to provide a global characterization of the small-molecule complement involved in metabolism. While the genome and the proteome define the tasks cells can perform, the metabolome is part of the actual phenotype. Among the methods currently used in metabolomics, spectroscopic techniques are of special interest because they allow one to simultaneously analyze a large number of metabolites without prior selection for specific biochemical pathways, thus enabling a broad unbiased approach. Here, an optimized experimental protocol for metabolomic analysis by high-resolution NMR spectroscopy is presented, which is the method of choice for efficient quantification of tissue metabolites. Important strengths of this method are (i) the use of crude extracts, without the need to purify the sample and/or separate metabolites; (ii) the intrinsically quantitative nature of NMR, permitting quantitation of all metabolites represented by an NMR spectrum with one reference compound only; and (iii) the nondestructive nature of NMR enabling repeated use of the same sample for multiple measurements. The dynamic range of metabolite concentrations that can be covered is considerable due to the linear response of NMR signals, although metabolites occurring at extremely low concentrations may be difficult to detect. For the least abundant compounds, the highly sensitive mass spectrometry method may be advantageous although this technique requires more intricate sample preparation and quantification procedures than NMR spectroscopy. 
We present here an NMR protocol adjusted to rat brain analysis; however, the same protocol can be applied to other tissues with minor modifications.
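The point about quantifying all metabolites against a single reference compound rests on the standard proportionality between an NMR signal's integral and the number of contributing nuclei. A minimal sketch of that calculation follows. It assumes fully relaxed spectra and that the reference and the metabolites are in the same sample; the function name and the example integrals, proton counts, and reference concentration are hypothetical and are not taken from the protocol.

```python
def metabolite_concentration(integral_met, protons_met, integral_ref, protons_ref, conc_ref):
    """Concentration of a metabolite from its integral relative to one reference signal.

    Each integral is normalized by the number of protons that produce the signal,
    and the ratio is scaled by the known reference concentration.
    """
    if integral_ref <= 0 or protons_met <= 0 or protons_ref <= 0:
        raise ValueError("reference integral and proton counts must be positive")
    return conc_ref * (integral_met / protons_met) / (integral_ref / protons_ref)


# Hypothetical numbers: a 9-proton reference singlet at 1.0 mM with integral 9.0,
# and a 3-proton metabolite signal with integral 6.0.
example = metabolite_concentration(
    integral_met=6.0, protons_met=3,
    integral_ref=9.0, protons_ref=9,
    conc_ref=1.0,
)
print(f"{example:.2f} mM")
```

The same reference integral can be reused for every resolved metabolite signal in the spectrum, which is what makes a single reference compound sufficient.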
Neuroscience, Issue 91, metabolomics, brain tissue, rodents, neurochemistry, tissue extracts, NMR spectroscopy, quantitative metabolite analysis, cerebral metabolism, metabolic profile
In Vitro Reconstitution of Light-harvesting Complexes of Plants and Green Algae
Institutions: VU University Amsterdam.
In plants and green algae, light is captured by the light-harvesting complexes (LHCs), a family of integral membrane proteins that coordinate chlorophylls and carotenoids. In vivo, these proteins are folded with pigments to form complexes which are inserted in the thylakoid membrane of the chloroplast. The high similarity in the chemical and physical properties of the members of the family, together with the fact that they can easily lose pigments during isolation, makes their purification in a native state challenging. An alternative approach to obtain homogeneous preparations of LHCs was developed by Plumley and Schmidt in 1987 [1], who showed that it was possible to reconstitute these complexes in vitro starting from purified pigments and unfolded apoproteins, resulting in complexes with properties very similar to those of native complexes. This opened the way to the use of bacterially expressed recombinant proteins for in vitro reconstitution. The reconstitution method is powerful for various reasons: (1) pure preparations of individual complexes can be obtained, (2) pigment composition can be controlled to assess their contribution to structure and function, (3) recombinant proteins can be mutated to study the functional role of the individual residues (e.g., pigment binding sites) or protein domains (e.g., protein-protein interaction, folding). This method has been optimized in several laboratories and applied to most of the light-harvesting complexes. The protocol described here details the method of reconstituting light-harvesting complexes in vitro currently used in our laboratory, and examples describing applications of the method are provided.
Biochemistry, Issue 92, Reconstitution, Photosynthesis, Chlorophyll, Carotenoids, Light Harvesting Protein, Chlamydomonas reinhardtii, Arabidopsis thaliana
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
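To make the idea of setting up experiment combinations concrete, the sketch below enumerates a full factorial design over three factors. This is only the simplest possible design and is an assumption for illustration: the abstract refers to software-guided optimal designs, which typically select a subset of such combinations. The factor names and levels are hypothetical placeholders loosely based on the factor groups mentioned above.

```python
from itertools import product

# Hypothetical factors and levels; the study's actual factors and ranges are not given here.
factors = {
    "promoter": ["P1", "P2"],
    "plant_age_days": [35, 42, 49],
    "incubation_temp_c": [22, 25],
}


def full_factorial(factor_levels):
    """Return one dict per run, covering every combination of factor levels."""
    names = list(factor_levels)
    level_lists = [factor_levels[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


runs = full_factorial(factors)
print(len(runs), "runs")
for run in runs[:3]:
    print(run)
```

The number of runs is the product of the level counts, so it grows quickly as factors are added. That growth is one reason to split a problem into smaller modules and to augment a design step-wise rather than testing every combination at once.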
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Synthesis of an Intein-mediated Artificial Protein Hydrogel
Institutions: Texas A&M University, College Station.
We present the synthesis of a highly stable protein hydrogel mediated by a split-intein-catalyzed protein trans-splicing reaction. The building blocks of this hydrogel are two protein block-copolymers each containing a subunit of a trimeric protein that serves as a crosslinker and one half of a split intein. A highly hydrophilic random coil is inserted into one of the block-copolymers for water retention. Mixing of the two protein block copolymers triggers an intein trans-splicing reaction, yielding a polypeptide unit with crosslinkers at either end that rapidly self-assembles into a hydrogel. This hydrogel is very stable under both acidic and basic conditions, at temperatures up to 50 °C, and in organic solvents. The hydrogel rapidly reforms after shear-induced rupture. Incorporation of a "docking station peptide" into the hydrogel building block enables convenient incorporation of "docking protein"-tagged target proteins. The hydrogel is compatible with tissue culture growth media, supports the diffusion of 20 kDa molecules, and enables the immobilization of bioactive globular proteins. The application of the intein-mediated protein hydrogel as an organic-solvent-compatible biocatalyst was demonstrated by encapsulating the horseradish peroxidase enzyme and corroborating its activity.
Bioengineering, Issue 83, split-intein, self-assembly, shear-thinning, enzyme, immobilization, organic synthesis
Optimized Negative Staining: a High-throughput Protocol for Examining Small and Asymmetric Protein Structure by Electron Microscopy
Institutions: The Molecular Foundry.
Structural determination of proteins is rather challenging for proteins with molecular masses between 40 and 200 kDa. Considering that more than half of natural proteins have a molecular mass between 40 and 200 kDa [1,2], a robust and high-throughput method with a nanometer resolution capability is needed. Negative staining (NS) electron microscopy (EM) is an easy, rapid, and qualitative approach which has frequently been used in research laboratories to examine protein structure and protein-protein interactions. Unfortunately, conventional NS protocols often generate structural artifacts on proteins, especially with lipoproteins, which usually present rouleaux artifacts. By using images of lipoproteins from cryo-electron microscopy (cryo-EM) as a standard, the key parameters in NS specimen preparation conditions were recently screened and reported as the optimized NS protocol (OpNS), a modified conventional NS protocol [3]. Artifacts like rouleaux can be greatly limited by OpNS, which additionally provides high contrast along with reasonably high-resolution (near 1 nm) images of small and asymmetric proteins. These high-resolution and high-contrast images are even suitable for 3D reconstruction of an individual protein (a single object, no averaging), such as a 160 kDa antibody, through the method of electron tomography [4,5]. Moreover, OpNS can be a high-throughput tool to examine hundreds of samples of small proteins. For example, the previously published mechanism of 53 kDa cholesteryl ester transfer protein (CETP) involved the screening and imaging of hundreds of samples [6]. Considering that cryo-EM rarely succeeds in imaging proteins smaller than 200 kDa and has yet to yield a published study involving the screening of over one hundred sample conditions, it is fair to call OpNS a high-throughput method for studying small proteins. Hopefully the OpNS protocol presented here can be a useful tool to push the boundaries of EM and accelerate EM studies into small protein structure, dynamics and mechanisms.
Environmental Sciences, Issue 90, small and asymmetric protein structure, electron microscopy, optimized negative staining
iCLIP - Transcriptome-wide Mapping of Protein-RNA Interactions with Individual Nucleotide Resolution
Institutions: Medical Research Council - MRC, EMBL Heidelberg, University of Ljubljana, Wellcome Trust Sanger Institute.
The unique composition and spatial arrangement of RNA-binding proteins (RBPs) on a transcript guide the diverse aspects of post-transcriptional regulation1
. Therefore, an essential step towards understanding transcript regulation at the molecular level is to gain positional information on the binding sites of RBPs2
Protein-RNA interactions can be studied using biochemical methods, but these approaches do not address RNA binding in its native cellular context. Initial attempts to study protein-RNA complexes in their cellular environment employed affinity purification or immunoprecipitation combined with differential display or microarray analysis (RIP-CHIP)3-5. These approaches were prone to identifying indirect or non-physiological interactions6. In order to increase the specificity and positional resolution, a strategy referred to as CLIP (UV cross-linking and immunoprecipitation) was introduced7,8. CLIP combines UV cross-linking of proteins and RNA molecules with rigorous purification schemes including denaturing polyacrylamide gel electrophoresis. In combination with high-throughput sequencing technologies, CLIP has proven to be a powerful tool to study protein-RNA interactions on a genome-wide scale (referred to as HITS-CLIP or CLIP-seq)9,10. Recently, PAR-CLIP, which uses photoreactive ribonucleoside analogs for cross-linking, was introduced11,12.
Despite the high specificity of the obtained data, CLIP experiments often generate cDNA libraries of limited sequence complexity. This is partly due to the restricted amount of co-purified RNA and the two inefficient RNA ligation reactions required for library preparation. In addition, primer extension assays indicated that many cDNAs truncate prematurely at the cross-linked nucleotide13. Such truncated cDNAs are lost during the standard CLIP library preparation protocol. We recently developed iCLIP (individual-nucleotide resolution CLIP), which captures the truncated cDNAs by replacing one of the inefficient intermolecular RNA ligation steps with a more efficient intramolecular cDNA circularization (Figure 1)14. Importantly, sequencing the truncated cDNAs provides insights into the position of the cross-link site at nucleotide resolution. We successfully applied iCLIP to study hnRNP C particle organization on a genome-wide scale and assess its role in splicing regulation14.
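To illustrate how truncated cDNAs can point to a cross-link site, the following minimal Python sketch assigns each mapped read to the nucleotide immediately upstream of its 5' end and counts reads per position. It rests on assumptions that are not stated in the abstract: reads are given as hypothetical `(chromosome, start, end, strand)` tuples with 0-based, end-exclusive coordinates, and the cross-linked nucleotide is taken to lie one position upstream of the cDNA start. It is not the authors' analysis pipeline.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical read representation: (chromosome, start, end, strand),
# with 0-based start and exclusive end, as in BED-style intervals.
Read = Tuple[str, int, int, str]


def crosslink_site(read: Read) -> Tuple[str, int, str]:
    """Return the position one nucleotide upstream of the read's 5' end.

    Assumption: the cDNA truncates at the cross-linked nucleotide, so the
    cross-link lies directly upstream of the first sequenced base.
    """
    chrom, start, end, strand = read
    if strand == "+":
        return chrom, start - 1, strand
    if strand == "-":
        # On the minus strand the 5' end is at end - 1; upstream is `end`.
        return chrom, end, strand
    raise ValueError(f"unknown strand: {strand!r}")


def count_crosslinks(reads: Iterable[Read]) -> Counter:
    """Count how many reads support each candidate cross-link position."""
    return Counter(crosslink_site(r) for r in reads)


if __name__ == "__main__":
    example_reads = [
        ("chr1", 100, 140, "+"),
        ("chr1", 100, 135, "+"),
        ("chr1", 60, 100, "-"),
    ]
    for site, n in count_crosslinks(example_reads).items():
        print(site, n)
```

Positions supported by many independent reads would be the strongest candidates; a real analysis would also need to handle duplicate reads and mapping ambiguity, which this sketch ignores.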
Cellular Biology, Issue 50, RNA biochemistry, transcriptome, systems biology, RNA-binding protein
Visualization of Recombinant DNA and Protein Complexes Using Atomic Force Microscopy
Institutions: Seattle University, Seattle University.
Atomic force microscopy (AFM) allows visualization of individual proteins, DNA molecules, protein-protein complexes, and DNA-protein complexes. On the end of the microscope's cantilever is a nano-scale probe, which traverses image areas ranging from nanometers to micrometers, measuring the elevation of macromolecules resting on the substrate surface at any given point. Electrostatic forces cause proteins, lipids, and nucleic acids to loosely attach to the substrate in random orientations and permit imaging. The generated data resemble a topographical map, where the macromolecules resolve as three-dimensional particles of discrete sizes (Figure 1). Tapping mode AFM involves the repeated oscillation of the cantilever, which permits imaging of relatively soft biomaterials such as DNA and proteins. One of the notable benefits of AFM over other nanoscale microscopy techniques is its relative adaptability to visualize individual proteins and macromolecular complexes in aqueous buffers, including near-physiologic buffered conditions, in real-time, and without staining or coating the sample to be imaged.
The method presented here describes the imaging of DNA and an immunoadsorbed transcription factor (i.e. the glucocorticoid receptor, GR) in buffered solution (Figure 2). Immunoadsorbed proteins and protein complexes can be separated from the immunoadsorbing antibody-bead pellet by competition with the antibody epitope and then imaged (Figure 2A). This allows for biochemical manipulation of the biomolecules of interest prior to imaging. Once purified, DNA and proteins can be mixed and the resultant interacting complex can be imaged as well. Binding of DNA to mica requires a divalent cation 3, such as Ni2+, which can be added to sample buffers while maintaining protein activity. Using a similar approach, AFM has been utilized to visualize individual enzymes, including RNA polymerase 4 and a repair enzyme 5, bound to individual DNA strands. These experiments provide significant insight into the protein-protein and DNA-protein biophysical interactions taking place at the molecular level. Imaging individual macromolecular particles with AFM can be useful for determining particle homogeneity and for identifying the physical arrangement of constituent components of the imaged particles. While the present method was developed for visualization of GR-chaperone protein complexes 1,2 and DNA strands to which the GR can bind, it can be applied broadly to imaging DNA and protein samples from a variety of sources.
Bioengineering, Issue 53, atomic force microscopy, glucocorticoid receptor, protein-protein interaction, DNA-protein interaction, scanning probe microscopy, immunoadsorption
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules, which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score that estimates their accuracy in the absence of experimental data. To accommodate special requests from end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. Users can collect this structural information from experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was ranked as the best program for protein structure and function prediction in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Using SecM Arrest Sequence as a Tool to Isolate Ribosome Bound Polypeptides
Institutions: Cleveland State University.
Extensive research has provided ample evidence suggesting that protein folding in the cell is a co-translational process1-5. However, the exact pathway that the polypeptide chain follows during co-translational folding to achieve its functional form is still an enigma. In order to understand this process and to determine the exact conformation of the co-translational folding intermediates, it is essential to develop techniques that allow the isolation of ribosome-nascent chain complexes (RNCs) carrying nascent chains of predetermined sizes for further structural analysis.
SecM (secretion monitor) is a 170 amino acid E. coli protein that regulates expression of the downstream SecA (secretion driving) ATPase in the secM-secA operon. Nakatogawa and Ito originally found that a 17 amino acid long sequence (150-FSTPVWISQAQGIRAGP-166) in the C-terminal region of the SecM protein is sufficient and necessary to cause stalling of SecM elongation at Gly165, thereby producing peptidyl-glycyl-tRNA stably bound to the ribosomal P-site7-9. More importantly, it was found that this 17 amino acid long sequence can be fused to the C-terminus of virtually any full-length and/or truncated protein, thus allowing the production of RNCs carrying nascent chains of predetermined sizes7. Thus, when fused to or inserted into the target protein, the SecM stalling sequence arrests polypeptide chain elongation and generates stable RNCs both in vivo in E. coli cells and in vitro in a cell-free system. Sucrose gradient centrifugation is then used to isolate the RNCs.
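The construct design described above can be sketched in a few lines of Python. The arrest sequence and the stall position (Gly165, the penultimate residue of the 17-mer) are taken from the text; the function names, the optional `linker` parameter, and the example target sequence are hypothetical and only serve as an illustration of how a nascent chain of predetermined size might be planned.

```python
from typing import Optional

# SecM residues 150-166, as given in the text.
SECM_ARREST = "FSTPVWISQAQGIRAGP"
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def fuse_secm(target: str, nascent_length: Optional[int] = None,
              linker: str = "") -> str:
    """Fuse the SecM arrest sequence to the C-terminus of a target protein.

    `nascent_length` optionally truncates the target to a predetermined
    number of residues; `linker` is a hypothetical optional spacer.
    """
    target = target.upper()
    if not set(target + linker.upper()) <= VALID_RESIDUES:
        raise ValueError("sequence contains non-standard residue codes")
    if nascent_length is not None:
        if not 0 < nascent_length <= len(target):
            raise ValueError("nascent_length must be within the target length")
        target = target[:nascent_length]
    return target + linker.upper() + SECM_ARREST


def stalled_chain(construct: str) -> str:
    """Return the chain expected on the stalled ribosome.

    Elongation arrests at the glycine corresponding to SecM Gly165, so the
    final proline of the arrest sequence is not incorporated.
    """
    if not construct.endswith(SECM_ARREST):
        raise ValueError("construct does not end with the SecM arrest sequence")
    return construct[:-1]


if __name__ == "__main__":
    # Made-up 10-residue target, truncated to its first 5 residues.
    construct = fuse_secm("MGKITFYEDR", nascent_length=5)
    print(construct)
    print(stalled_chain(construct))
```

In practice the number of residues buried in the ribosomal exit tunnel also has to be considered when choosing the truncation point; that is outside the scope of this sketch.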
The isolated RNCs can be used to analyze structural and functional features of the co-translational folding intermediates. Recently, this technique has been successfully used to gain insights into the structure of several ribosome-bound nascent chains10,11. Here we describe the isolation of bovine Gamma-B Crystallin RNCs fused to SecM and generated in an in vitro translation system.
Molecular Biology, Issue 64, Ribosome, nascent polypeptides, co-translational protein folding, translational arrest, in vitro translation
Affinity Purification of Influenza Virus Ribonucleoprotein Complexes from the Chromatin of Infected Cells
Institutions: Universitätsklinikum Freiburg.
Like all negative-strand RNA viruses, the genome of influenza viruses is packaged in the form of viral ribonucleoprotein complexes (vRNP), in which the single-stranded genome is encapsidated by the nucleoprotein (NP), and associated with the trimeric polymerase complex consisting of the PA, PB1, and PB2 subunits. However, in contrast to most RNA viruses, influenza viruses perform viral RNA synthesis in the nuclei of infected cells. Interestingly, viral mRNA synthesis uses cellular pre-mRNAs as primers, and it has been proposed that this process takes place on chromatin1. Interactions between the viral polymerase and the host RNA polymerase II, as well as between NP and host nucleosomes, have also been characterized1,2.
Recently, the generation of recombinant influenza viruses encoding a One-Strep-Tag genetically fused to the C-terminus of the PB2 subunit of the viral polymerase (rWSN-PB2-Strep3) has been described. These recombinant viruses allow the purification of PB2-containing complexes, including vRNPs, from infected cells. To obtain purified vRNPs, cell cultures are infected, and vRNPs are affinity purified from lysates derived from these cells. However, the lysis procedures used to date have been based on one-step detergent lysis, which, despite the presence of a general nuclease, often extracts chromatin-bound material only inefficiently.
Our preliminary work suggested that a large portion of nuclear vRNPs were not extracted during traditional cell lysis, and therefore could not be affinity purified. To increase this extraction efficiency, and to separate chromatin-bound from non-chromatin-bound nuclear vRNPs, we adapted a step-wise subcellular extraction protocol to influenza virus-infected cells. Briefly, this procedure first separates the nuclei from the cell and then extracts soluble nuclear proteins (here termed the "nucleoplasmic" fraction). The remaining insoluble nuclear material is then digested with Benzonase, an unspecific DNA/RNA nuclease, followed by two salt extraction steps: first using 150 mM NaCl (termed "ch150"), then 500 mM NaCl ("ch500") (Fig. 1). These salt extraction steps were chosen based on our observation that 500 mM NaCl was sufficient to solubilize over 85% of nuclear vRNPs yet still allow binding of tagged vRNPs to the affinity matrix.
After subcellular fractionation of infected cells, it is possible to affinity purify PB2-tagged vRNPs from each individual fraction and analyze their protein and RNA components using Western blot and primer extension, respectively. Recently, we utilized this method to discover that vRNP export complexes form at late time points after infection on the chromatin fraction extracted with 500 mM NaCl (ch500)3.
Virology, Issue 64, Immunology, Molecular Biology, Influenza A virus, affinity purification, subcellular fractionation, chromatin, vRNP complexes, polymerase
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI)1. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7, but GI information remains sparse for prokaryotes8, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10.
Here, we present the key steps required to perform a quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome scale9, using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often arise between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2, as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
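As a rough illustration of how a double mutant can be judged "better or worse than expected", the Python sketch below compares observed double-mutant fitness with an expectation derived from the two single mutants. The abstract does not specify the scoring model; the multiplicative expectation, the fitness normalization to wild type, and the fixed `threshold` used here are all assumptions chosen for illustration, not the eSGA scoring pipeline.

```python
def gi_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Deviation of observed double-mutant fitness from expectation.

    Fitness values are assumed to be colony sizes normalized to wild type.
    Assumption: independent mutations combine multiplicatively, so the
    expected double-mutant fitness is w_a * w_b.
    """
    return w_ab - w_a * w_b


def classify_gi(score: float, threshold: float = 0.1) -> str:
    """Label an interaction using a hypothetical fixed threshold."""
    if score <= -threshold:
        return "aggravating"   # double mutant grows worse than expected
    if score >= threshold:
        return "alleviating"   # double mutant grows better than expected
    return "none"


if __name__ == "__main__":
    examples = {
        "compensatory pathways": (0.9, 0.8, 0.2),
        "same pathway": (0.6, 0.6, 0.6),
        "unrelated genes": (0.9, 0.8, 0.72),
    }
    for name, (w_a, w_b, w_ab) in examples.items():
        score = gi_score(w_a, w_b, w_ab)
        print(f"{name}: score={score:+.2f} -> {classify_gi(score)}")
```

A genome-scale screen would replace the fixed threshold with a statistical test across replicate colonies, since the text refers to growth that is significantly better or worse than expected.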
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
The MultiBac Protein Complex Production Platform at the EMBL
Institutions: EMBL Grenoble Outstation and Unit of Virus Host Cell Interactions (UVHCI) UMR5322.
Proteomics research has revealed the impressive complexity of eukaryotic proteomes in unprecedented detail. It is now a commonly accepted notion that proteins in cells mostly do not exist as isolated entities but exert their biological activity in association with many other proteins (in humans, ten or more), forming assembly lines in the cell for most if not all vital functions.1,2 Knowledge of the function and architecture of these multiprotein assemblies requires their provision in superior quality and sufficient quantity for detailed analysis. The paucity of many protein complexes in cells, in particular in eukaryotes, prohibits their extraction from native sources, and necessitates recombinant production. The baculovirus expression vector system (BEVS) has proven to be particularly useful for producing eukaryotic proteins, the activity of which often relies on post-translational processing that other commonly used expression systems often cannot support.3 BEVS use a recombinant baculovirus, into which the gene of interest has been inserted, to infect insect cell cultures, which in turn produce the protein of choice. MultiBac is a BEVS that has been particularly tailored for the production of eukaryotic protein complexes that contain many subunits.4 A vital prerequisite for efficient production of proteins and their complexes is a set of robust protocols for all steps involved in an expression experiment that ideally can be implemented as standard operating procedures (SOPs) and followed with comparative ease even by non-specialist users. The MultiBac platform at the European Molecular Biology Laboratory (EMBL) uses SOPs for all steps involved in a multiprotein complex expression experiment, starting from insertion of the genes into an engineered baculoviral genome optimized for heterologous protein production properties to small-scale analysis of the protein specimens produced.5-8 The platform is installed in an open-access mode at EMBL Grenoble and has supported many scientists from academia and industry to accelerate protein complex research projects.
Molecular Biology, Issue 77, Genetics, Bioengineering, Virology, Biochemistry, Microbiology, Basic Protocols, Genomics, Proteomics, Automation, Laboratory, Biotechnology, Multiprotein Complexes, Biological Science Disciplines, Robotics, Protein complexes, multigene delivery, recombinant expression, baculovirus system, MultiBac platform, standard operating procedures (SOP), cell, culture, DNA, RNA, protein, production, sequencing
Analyzing and Building Nucleic Acid Structures with 3DNA
Institutions: Rutgers - The State University of New Jersey, Columbia University.
The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the set of instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamics (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at https://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Genetics, Issue 74, Molecular Biology, Biochemistry, Bioengineering, Biophysics, Genomics, Chemical Biology, Quantitative Biology, conformational analysis, DNA, high-resolution structures, model building, molecular dynamics, nucleic acid structure, RNA, visualization, bioinformatics, three-dimensional, 3DNA, software
A Rapid High-throughput Method for Mapping Ribonucleoproteins (RNPs) on Human pre-mRNA
Institutions: Brown University, Brown University.
Sequencing RNAs that co-immunoprecipitate (co-IP) with RNA binding proteins has increased our understanding of splicing by demonstrating that binding location often influences the function of a splicing factor. However, as with any sampling strategy, the chance of identifying an RNA bound to a splicing factor is proportional to its cellular abundance. We have developed a novel in vitro approach for surveying binding specificity on otherwise transient pre-mRNA. This approach utilizes a specifically designed oligonucleotide pool that tiles across introns, exons, splice junctions, or other pre-mRNA regions. The pool is then subjected to a molecular selection step. Here, we demonstrate the method by separating the oligonucleotides into bound and unbound fractions and using a two-color array strategy to record the enrichment of each oligonucleotide in the bound fraction. The array data generate high-resolution maps with the ability to identify sequence-specific and structural determinants of ribonucleoprotein (RNP) binding on pre-mRNA. A unique advantage of this method is its ability to avoid the sampling bias towards mRNA associated with current IP and SELEX techniques, as the pool is specifically designed and synthesized from pre-mRNA sequence. The flexibility of the oligonucleotide pool is another advantage, since the experimenter chooses which regions to study and tile across, tailoring the pool to their individual needs. Using this technique, one can assay the effects of polymorphisms or mutations on binding on a large scale, or clone the library into a functional splicing reporter and identify oligonucleotides that are enriched in the included fraction. This novel in vitro high-resolution mapping scheme provides a unique way to study RNP interactions with transient pre-mRNA species, whose low abundance makes them difficult to study with current in vivo techniques.
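The two computational ideas in this abstract, tiling a pre-mRNA region with overlapping oligonucleotides and recording per-oligonucleotide enrichment in the bound fraction, can be sketched in Python as follows. The oligonucleotide length, the step size, the pseudocount, and the use of a log2 ratio of bound to unbound signal are illustrative assumptions; the abstract does not give the actual design parameters or normalization.

```python
import math
from typing import List, Tuple


def tile(sequence: str, length: int = 30, step: int = 10) -> List[Tuple[int, str]]:
    """Return (start, oligo) pairs that tile across a sequence.

    `length` and `step` are hypothetical design parameters; consecutive
    oligos overlap by length - step nucleotides.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        (start, sequence[start:start + length])
        for start in range(0, len(sequence) - length + 1, step)
    ]


def log2_enrichment(bound: float, unbound: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of bound to unbound signal for one oligonucleotide.

    The pseudocount (an assumption) avoids division by zero for weak spots.
    """
    return math.log2((bound + pseudocount) / (unbound + pseudocount))


if __name__ == "__main__":
    region = "GUAAGU" * 10  # made-up 60-nt pre-mRNA region
    oligos = tile(region, length=30, step=10)
    # Made-up two-color intensities (bound, unbound) for each oligo.
    intensities = [(700.0, 100.0), (120.0, 115.0), (40.0, 300.0), (90.0, 95.0)]
    for (start, oligo), (b, u) in zip(oligos, intensities):
        print(start, oligo, round(log2_enrichment(b, u), 2))
```

Plotting the enrichment values against the start positions gives a binding map along the tiled region; smaller step sizes increase the positional resolution at the cost of a larger pool.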
Cellular Biology, Issue 34, pre-mRNA, splicing factors, tiling array, ribonucleoprotein (RNP), binding maps
Interview: Protein Folding and Studies of Neurodegenerative Diseases
Institutions: MIT - Massachusetts Institute of Technology.
In this interview, Dr. Lindquist describes relationships between protein folding, prion diseases and neurodegenerative disorders. The problem of protein folding is at the core of modern biology. In addition to their traditional biochemical functions, proteins can mediate the transfer of biological information and can therefore be considered a genetic material. This recently discovered function of proteins has important implications for studies of human disorders. Dr. Lindquist also describes current experimental approaches to investigate the mechanism of neurodegenerative diseases based on genetic studies in model organisms.
Neuroscience, Issue 17, protein folding, brain, neuron, prion, neurodegenerative disease, yeast, screen, Translational Research
RNA Extraction from Neuroprecursor Cells Using the Bio-Rad Total RNA Kit
Institutions: University of California, Irvine (UCI), University of California, Irvine (UCI).
Basic Protocols, Issue 9, RNA, Purification, Brain