Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI)1. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7, but GI information remains sparse for prokaryotes8, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10.
Here, we present the key steps required to perform the quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome scale9, using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format. Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth).
Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2 as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
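The scoring idea described above (a double mutant growing better or worse than expected) can be illustrated with a minimal sketch. It assumes a multiplicative null model, in which the expected double-mutant fitness is the product of the two single-mutant fitness values, and an arbitrary cutoff of 0.1; both are assumptions made for illustration, and the function names are hypothetical. The published eSGA pipeline applies its own normalization and statistical scoring, which this sketch does not reproduce.

```python
def gi_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the multiplicative expectation.

    Fitness values are relative to wild type (1.0 = wild-type growth).
    The multiplicative expectation is an assumed null model.
    """
    expected = fitness_a * fitness_b
    return fitness_ab - expected


def classify_interaction(score, threshold=0.1):
    """Label a score; the threshold is an illustrative assumption."""
    if score <= -threshold:
        return "aggravating"
    if score >= threshold:
        return "alleviating"
    return "neutral"


# Two mildly sick single mutants whose double mutant barely grows.
print(classify_interaction(gi_score(0.9, 0.8, 0.2)))
# Two mutants in the same pathway: the double is no sicker than either single.
print(classify_interaction(gi_score(0.6, 0.6, 0.6)))
```

In a real screen the cutoff would be replaced by a significance test across replicate colonies rather than a fixed number.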
22 Related JoVE Articles!
Identifying Targets of Human microRNAs with the LightSwitch Luciferase Assay System using 3'UTR-reporter Constructs and a microRNA Mimic in Adherent Cells
Institutions: SwitchGear Genomics.
MicroRNAs (miRNAs) are important regulators of gene expression and play a role in many biological processes. More than 700 human miRNAs have been identified so far with each having up to hundreds of unique target mRNAs. Computational tools, expression and proteomics assays, and chromatin-immunoprecipitation-based techniques provide important clues for identifying mRNAs that are direct targets of a particular miRNA. In addition, 3'UTR-reporter assays have become an important component of thorough miRNA target studies because they provide functional evidence for and quantitate the effects of specific miRNA-3'UTR interactions in a cell-based system. To enable more researchers to leverage 3'UTR-reporter assays and to support the scale-up of such assays to high-throughput levels, we have created a genome-wide collection of human 3'UTR luciferase reporters in the highly-optimized LightSwitch Luciferase Assay System. The system also includes synthetic miRNA target reporter constructs for use as positive controls, various endogenous 3'UTR reporter constructs, and a series of standardized experimental protocols.
Here we describe a method for co-transfection of individual 3'UTR-reporter constructs along with a miRNA mimic that is efficient, reproducible, and amenable to high-throughput analysis.
Genetics, Issue 55, MicroRNA, miRNA, mimic, Clone, 3' UTR, Assay, vector, LightSwitch, luciferase, co-transfection, 3'UTR REPORTER, mirna target, microrna target, reporter, GoClone, Reporter construct
Mosaic Zebrafish Transgenesis for Evaluating Enhancer Sequences
Institutions: University of Pennsylvania.
The completion of the human genome sequence, along with that of many other species, has highlighted the challenge of ascribing specific function to non-coding sequences. One prominent function carried out by the non-coding fraction of the genome is to regulate gene transcription; however, there are no effective methods to broadly predict cis-regulatory elements from primary DNA sequence. We have developed an efficient protocol to functionally evaluate potential cis-regulatory elements through zebrafish transgenesis. Our approach offers significant advantages over cell-culture based techniques for developmentally important genes, since it provides information on spatial and temporal gene regulation. At the same time, it is faster and less expensive than similar experiments in transgenic mice, and we routinely apply it to sequences isolated from the human genome. Here we demonstrate our approach to selecting elements for testing based on sequence conservation and our protocol for cloning sequences and microinjecting them into zebrafish embryos.
Cellular Biology, Issue 41, zebrafish, transgenesis, microinjection, GFP, enhancers, transposon
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by next-generation sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5 to more rigorous statistical models, e.g. Hidden Markov Models (HMMs)6-8. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9, which utilizes only explicit formulas, an innovation crucial to its performance advantages. More sophisticated than heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity, evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11 and epigenetic data12 to illustrate its usefulness.
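To make the notion of a change point in read density concrete, the sketch below locates a single change point in a vector of binned read counts by maximum likelihood under a Poisson model. This is a deliberately simplified, non-Bayesian stand-in and is not the BCP algorithm, which uses an infinite-state HMM with explicit posterior formulas; the function names and the toy counts are assumptions for illustration only.

```python
import math


def poisson_loglik(counts):
    """Poisson log-likelihood of a segment at its maximum-likelihood rate.

    The sum of log(k!) terms is omitted because it is identical for every
    candidate split and therefore does not affect the comparison.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    rate = total / len(counts)
    return total * math.log(rate) - len(counts) * rate


def best_change_point(counts):
    """Return the index i that best splits counts into counts[:i] and counts[i:]."""
    best_index, best_ll = None, -math.inf
    for i in range(1, len(counts)):
        ll = poisson_loglik(counts[:i]) + poisson_loglik(counts[i:])
        if ll > best_ll:
            best_index, best_ll = i, ll
    return best_index


# Toy read counts per bin: background followed by an enriched segment.
bins = [1, 2, 1, 1, 2, 9, 10, 11, 9, 10]
print(best_change_point(bins))
```

Applying such a split recursively gives a crude segmentation; a Bayesian formulation additionally provides posterior uncertainty for each boundary.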
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e. total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contributions of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
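As an illustration of how an RNA half-life might be derived from labeling data, the sketch below uses the ratio of newly transcribed to total RNA after a known labeling time. It assumes steady-state transcript levels and first-order decay, so that the labeled fraction equals 1 - exp(-k t); these assumptions, the function name, and the example numbers are not taken from the abstract.

```python
import math


def rna_half_life(new_to_total_ratio, labeling_minutes):
    """Half-life (minutes) from the newly transcribed / total RNA ratio.

    Assumes steady state and first-order decay, so that
    ratio = 1 - exp(-k * t) and half-life = ln(2) / k.
    """
    if not 0.0 < new_to_total_ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    decay_rate = -math.log(1.0 - new_to_total_ratio) / labeling_minutes
    return math.log(2.0) / decay_rate


# If half of a transcript pool is newly made after 60 min of labeling,
# the pool turns over with a half-life of about 60 min.
print(round(rna_half_life(0.5, 60.0), 2))
```

Ratios close to 0 or 1 give unstable estimates, which is one reason the labeling time is usually matched to the expected range of half-lives.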
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
Characterization of G Protein-coupled Receptors by a Fluorescence-based Calcium Mobilization Assay
Institutions: KU Leuven.
For more than 20 years, reverse pharmacology has been the preeminent strategy to discover the activating ligands of orphan G protein-coupled receptors (GPCRs). A reverse pharmacology assay begins with the cloning and subsequent transfection of a GPCR of interest in a cellular expression system. The heterologously expressed receptor is then challenged with a compound library of candidate ligands to identify the receptor-activating ligand(s). Receptor activation can be assessed by measuring changes in concentration of second messenger reporter molecules, like calcium or cAMP. The fluorescence-based calcium mobilization assay described here is a frequently used medium-throughput reverse pharmacology assay. The orphan GPCR is transiently expressed in human embryonic kidney 293T (HEK293T) cells and a promiscuous Gα16 construct is co-transfected. Following ligand binding, activation of the Gα16 subunit induces the release of calcium from the endoplasmic reticulum. Prior to ligand screening, the receptor-expressing cells are loaded with a fluorescent calcium indicator, Fluo-4 acetoxymethyl. The fluorescent signal of Fluo-4 is negligible in cells under resting conditions, but can be amplified more than 100-fold upon the interaction with calcium ions that are released after receptor activation. The described technique does not require the time-consuming establishment of stably transfected cell lines in which the transfected genetic material is integrated into the host cell genome. Instead, a transient transfection, generating temporary expression of the target gene, is sufficient to perform the screening assay. The setup allows medium-throughput screening of hundreds of compounds. Co-transfection of the promiscuous Gα16, which couples to most GPCRs, allows the intracellular signaling pathway to be redirected towards the release of calcium, regardless of the native signaling pathway in endogenous settings. The HEK293T cells are easy to handle and have proven their efficacy throughout the years in receptor deorphanization assays. However, optimization of the assay for specific receptors may remain necessary.
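A calcium response of this kind is often summarized as the peak fluorescence change relative to the resting baseline. The sketch below computes that quantity from a single-well fluorescence trace; the function name, the choice of how many initial reads count as baseline, and the example values are assumptions for illustration, not part of the described protocol.

```python
from statistics import mean


def relative_response(trace, n_baseline):
    """Peak fluorescence change relative to baseline, (F_peak - F0) / F0.

    trace: fluorescence reads over time for one well.
    n_baseline: number of initial reads taken before compound addition
    (an assumed convention for this sketch).
    """
    if not 0 < n_baseline < len(trace):
        raise ValueError("n_baseline must leave at least one post-addition read")
    baseline = mean(trace[:n_baseline])
    peak = max(trace[n_baseline:])
    return (peak - baseline) / baseline


# Three resting reads, then a transient rise after ligand addition.
print(relative_response([100, 100, 100, 400, 300], n_baseline=3))
```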
Cellular Biology, Issue 89, G protein-coupled receptor (GPCR), calcium mobilization assay, reverse pharmacology, deorphanization, cellular expression system, HEK293T, Fluo-4, FlexStation
Whole Mount RNA Fluorescent in situ Hybridization of Drosophila Embryos
Institutions: Institut de Recherches Cliniques de Montréal (IRCM), Université de Montréal.
Assessing the expression pattern of a gene, as well as the subcellular localization properties of its transcribed RNA, is key to understanding its biological function during development. RNA in situ hybridization (RNA-ISH) is a powerful method used for visualizing RNA distribution properties, be it at the organismal, cellular or subcellular levels1. RNA-ISH is based on the hybridization of a labeled nucleic acid probe (e.g. antisense RNA, oligonucleotides) complementary to the sequence of an mRNA or a non-coding RNA target of interest2. As the procedure requires primary sequence information alone to generate sequence-specific probes, it can be universally applied to a broad range of organisms and tissue specimens3. Indeed, a number of large-scale ISH studies have been implemented to document gene expression and RNA localization dynamics in various model organisms, which has led to the establishment of important community resources4-11. While a variety of probe labeling and detection strategies have been developed over the years, the combined usage of fluorescently-labeled detection reagents and enzymatic signal amplification steps offers significant enhancements in the sensitivity and resolution of the procedure12. Here, we describe an optimized fluorescent in situ hybridization method (FISH) employing tyramide signal amplification (TSA) to visualize RNA expression and localization dynamics in staged Drosophila embryos. The procedure is carried out in 96-well PCR plate format, which greatly facilitates the simultaneous processing of large numbers of samples.
Developmental Biology, Issue 71, Cellular Biology, Molecular Biology, Genetics, Genomics, Drosophila, Embryo, Fluorescent in situ hybridization, FISH, Gene Expression Pattern, RNA Localization, RNA, Tyramide Signal Amplification, TSA, knockout, fruit fly, whole mount, embryogenesis, animal model
Identifying Protein-protein Interaction in Drosophila Adult Heads by Tandem Affinity Purification (TAP)
Institutions: Louisiana State University Health Sciences Center.
Genetic screens conducted using Drosophila melanogaster (fruit fly) have made numerous milestone discoveries in the advance of biological sciences. However, the use of biochemical screens aimed at extending the knowledge gained from genetic analysis was explored only recently. Here we describe a method to purify the protein complex that associates with any protein of interest from adult fly heads. This method takes advantage of the Drosophila GAL4/UAS system to express a bait protein fused with a Tandem Affinity Purification (TAP) tag in fly neurons in vivo, and then implements two rounds of purification using a TAP procedure similar to the one originally established in yeast1 to purify the interacting protein complex. At the end of this procedure, a mixture of multiple protein complexes is obtained whose molecular identities can be determined by mass spectrometry. Validation of the candidate proteins will benefit from the resource and ease of performing loss-of-function studies in flies. Similar approaches can be applied to other fly tissues. We believe that the combination of genetic manipulations and this proteomic approach in the fly model system holds tremendous potential for tackling fundamental problems in the field of neurobiology and beyond.
Biochemistry, Issue 82, Drosophila, GAL4/UAS system, transgenic, Tandem Affinity Purification, protein-protein interaction, proteomics
Investigating Protein-protein Interactions in Live Cells Using Bioluminescence Resonance Energy Transfer
Institutions: Max Planck Institute for Psycholinguistics, Donders Institute for Brain, Cognition and Behaviour.
Assays based on Bioluminescence Resonance Energy Transfer (BRET) provide a sensitive and reliable means to monitor protein-protein interactions in live cells. BRET is the non-radiative transfer of energy from a 'donor' luciferase enzyme to an 'acceptor' fluorescent protein. In the most common configuration of this assay, the donor is Renilla reniformis luciferase and the acceptor is Yellow Fluorescent Protein (YFP). Because the efficiency of energy transfer is strongly distance-dependent, observation of the BRET phenomenon requires that the donor and acceptor be in close proximity. To test for an interaction between two proteins of interest in cultured mammalian cells, one protein is expressed as a fusion with luciferase and the second as a fusion with YFP. An interaction between the two proteins of interest may bring the donor and acceptor sufficiently close for energy transfer to occur. Compared to other techniques for investigating protein-protein interactions, the BRET assay is sensitive, requires little hands-on time and few reagents, and is able to detect interactions which are weak, transient, or dependent on the biochemical environment found within a live cell. It is therefore an ideal approach for confirming putative interactions suggested by yeast two-hybrid or mass spectrometry proteomics studies, and in addition it is well-suited for mapping interacting regions, assessing the effect of post-translational modifications on protein-protein interactions, and evaluating the impact of mutations identified in patient DNA.
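A BRET readout is commonly expressed as the ratio of acceptor-channel to donor-channel emission, corrected by the same ratio measured in cells expressing the donor fusion alone. The sketch below shows that arithmetic; the correction scheme, function names, and example readings are assumptions for illustration and may differ from the calculation used in the protocol itself.

```python
def bret_ratio(acceptor_emission, donor_emission):
    """Raw BRET ratio: light in the acceptor channel over the donor channel."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    return acceptor_emission / donor_emission


def corrected_bret(sample, donor_only):
    """Subtract the donor-only ratio to remove donor bleed-through.

    sample, donor_only: (acceptor_emission, donor_emission) tuples.
    """
    return bret_ratio(*sample) - bret_ratio(*donor_only)


# Hypothetical plate-reader counts for an interacting pair and a donor-only control.
print(round(corrected_bret((300.0, 1000.0), (100.0, 1000.0)), 3))
```

A corrected ratio near zero indicates no detectable energy transfer above the donor-only background.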
Cellular Biology, Issue 87, Protein-protein interactions, Bioluminescence Resonance Energy Transfer, Live cell, Transfection, Luciferase, Yellow Fluorescent Protein, Mutations
Biochemical and High Throughput Microscopic Assessment of Fat Mass in Caenorhabditis elegans
Institutions: Massachusetts General Hospital and Harvard Medical School, Massachusetts Institute of Technology.
The nematode C. elegans has emerged as an important model for the study of conserved genetic pathways regulating fat metabolism as it relates to human obesity and its associated pathologies. Several previous methodologies developed for the visualization of C. elegans triglyceride-rich fat stores have proven to be erroneous, highlighting cellular compartments other than lipid droplets. Other methods require specialized equipment, are time-consuming, or yield inconsistent results. We introduce a rapid, reproducible, fixative-based Nile red staining method for the accurate and rapid detection of neutral lipid droplets in C. elegans. A short fixation step in 40% isopropanol makes animals completely permeable to Nile red, which is then used to stain animals. Spectral properties of this lipophilic dye allow it to strongly and selectively fluoresce in the yellow-green spectrum only when in a lipid-rich environment, but not in more polar environments. Thus, lipid droplets can be visualized on a fluorescent microscope equipped with simple GFP imaging capability after only a brief Nile red staining step in isopropanol. The speed, affordability, and reproducibility of this protocol make it ideally suited for high throughput screens. We also demonstrate a paired method for the biochemical determination of triglycerides and phospholipids using gas chromatography mass-spectrometry. This more rigorous protocol should be used as confirmation of results obtained from the Nile red microscopic lipid determination. We anticipate that these techniques will become new standards in the field of C. elegans
Genetics, Issue 73, Biochemistry, Cellular Biology, Molecular Biology, Developmental Biology, Physiology, Anatomy, Caenorhabditis elegans, Obesity, Energy Metabolism, Lipid Metabolism, C. elegans, fluorescent lipid staining, lipids, Nile red, fat, high throughput screening, obesity, gas chromatography, mass spectrometry, GC/MS, animal model
DNA-affinity-purified Chip (DAP-chip) Method to Determine Gene Targets for Bacterial Two component Regulatory Systems
Institutions: Lawrence Berkeley National Laboratory.
In vivo methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro microarray based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs and thus predicting the function for a sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris
Genetics, Issue 89, DNA-Affinity-Purified-chip, response regulator, transcription factor binding site, two component system, signal transduction, Desulfovibrio, lactate utilization regulator, ChIP-chip
Simultaneous Multicolor Imaging of Biological Structures with Fluorescence Photoactivation Localization Microscopy
Institutions: University of Maine.
Localization-based super resolution microscopy can be applied to obtain a spatial map (image) of the distribution of individual fluorescently labeled single molecules within a sample with a spatial resolution of tens of nanometers. Using either photoactivatable (PAFP) or photoswitchable (PSFP) fluorescent proteins fused to proteins of interest, or organic dyes conjugated to antibodies or other molecules of interest, fluorescence photoactivation localization microscopy (FPALM) can simultaneously image multiple species of molecules within single cells. By using the following approach, populations of large numbers (thousands to hundreds of thousands) of individual molecules are imaged in single cells and localized with a precision of ~10-30 nm. Data obtained can be applied to understanding the nanoscale spatial distributions of multiple protein types within a cell. One primary advantage of this technique is the dramatic increase in spatial resolution: while diffraction limits resolution to ~200-250 nm in conventional light microscopy, FPALM can image length scales more than an order of magnitude smaller. As many biological hypotheses concern the spatial relationships among different biomolecules, the improved resolution of FPALM can provide insight into questions of cellular organization which have previously been inaccessible to conventional fluorescence microscopy. In addition to detailing the methods for sample preparation and data acquisition, we here describe the optical setup for FPALM. One additional consideration for researchers wishing to do super-resolution microscopy is cost: in-house setups are significantly cheaper than most commercially available imaging machines. Limitations of this technique include the need for optimizing the labeling of molecules of interest within cell samples, and the need for post-processing software to visualize results. We here describe the use of PAFP and PSFP expression to image two protein species in fixed cells. Extension of the technique to living cells is also described.
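The gap between the ~200-250 nm diffraction limit and the ~10-30 nm localization precision quoted above can be illustrated numerically. The sketch below uses two textbook approximations: the Abbe limit, wavelength / (2 NA), and a localization precision of roughly the point-spread-function width divided by the square root of the number of detected photons. The second formula ignores pixelation and background noise, and the example wavelength, numerical aperture, PSF width, and photon count are assumed values, not figures from the article.

```python
import math


def abbe_limit_nm(wavelength_nm, numerical_aperture):
    """Approximate diffraction-limited resolution, wavelength / (2 * NA)."""
    return wavelength_nm / (2.0 * numerical_aperture)


def localization_precision_nm(psf_sigma_nm, n_photons):
    """Idealized precision of a single-molecule position estimate.

    Uses sigma / sqrt(N); pixel size and background terms are neglected,
    so real precision is somewhat worse than this value.
    """
    if n_photons <= 0:
        raise ValueError("n_photons must be positive")
    return psf_sigma_nm / math.sqrt(n_photons)


# Assumed example values: green emission, a 1.2 NA objective,
# a 100 nm PSF standard deviation and 100 detected photons.
print(round(abbe_limit_nm(520.0, 1.2), 1))
print(localization_precision_nm(100.0, 100))
```

The square-root dependence explains why brighter probes, which yield more photons per activation event, localize more precisely.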
Basic Protocol, Issue 82, Microscopy, Super-resolution imaging, Multicolor, single molecule, FPALM, Localization microscopy, fluorescent proteins
Polysome Fractionation and Analysis of Mammalian Translatomes on a Genome-wide Scale
Institutions: McGill University, Karolinska Institutet, McGill University.
mRNA translation plays a central role in the regulation of gene expression and represents the most energy consuming process in mammalian cells. Accordingly, dysregulation of mRNA translation is considered to play a major role in a variety of pathological states including cancer. Ribosomes also host chaperones, which facilitate folding of nascent polypeptides, thereby modulating function and stability of newly synthesized polypeptides. In addition, emerging data indicate that ribosomes serve as a platform for a repertoire of signaling molecules, which are implicated in a variety of post-translational modifications of newly synthesized polypeptides as they emerge from the ribosome, and/or components of translational machinery. Herein, a well-established method of ribosome fractionation using sucrose density gradient centrifugation is described. In conjunction with the in-house developed “anota” algorithm this method allows direct determination of differential translation of individual mRNAs on a genome-wide scale. Moreover, this versatile protocol can be used for a variety of biochemical studies aiming to dissect the function of ribosome-associated protein complexes, including those that play a central role in folding and degradation of newly synthesized polypeptides.
Biochemistry, Issue 87, Cells, Eukaryota, Nutritional and Metabolic Diseases, Neoplasms, Metabolic Phenomena, Cell Physiological Phenomena, mRNA translation, ribosomes, protein synthesis, genome-wide analysis, translatome, mTOR, eIF4E, 4E-BP1
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which estimates how accurate the predictions are without knowledge of the experimental data. To accommodate special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. The structural information can be collected by the users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was ranked as the best program for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
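To show what an experiment combination looks like in a design-of-experiments setting, the sketch below enumerates a full factorial design, the simplest design type, over three factors. The factor names and levels are hypothetical stand-ins for the kinds of parameters mentioned above (construct elements, plant growth parameters, incubation conditions). The article describes software-guided optimal designs with step-wise augmentation, which select a subset of such runs; this sketch does not reproduce that selection.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "promoter": ["P1", "P2"],
    "plant_age_days": [35, 42, 49],
    "incubation_temp_c": [22, 25],
}


def full_factorial(factor_levels):
    """Return every combination of factor levels as a list of run dictionaries."""
    names = list(factor_levels)
    return [dict(zip(names, levels)) for levels in product(*factor_levels.values())]


runs = full_factorial(factors)
print(len(runs))  # 2 * 3 * 2 combinations
print(runs[0])
```

Because the number of runs grows multiplicatively with each added factor, optimal or fractional designs are usually preferred once more than a handful of factors are involved.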
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.).
Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and levels of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
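To make the idea of a base-enriched region concrete, the following minimal Python sketch scans a sequence with a sliding window and reports windows whose content of a chosen base set (for example G+T) meets a cutoff. It is an illustration only and not the algorithm used by the Genomic MRI package, which is not described here; the function name `content_windows`, the default window of 30 nucleotides (the lower bound of the MRI length range given above), and the 0.7 threshold are all assumptions.

```python
def content_windows(seq, bases="GT", window=30, threshold=0.7):
    """Return (start, fraction) for every window enriched in `bases`.

    Hypothetical helper for illustration; `window` and `threshold`
    are assumed values, not parameters of the Genomic MRI package.
    """
    seq = seq.upper()
    target = set(bases.upper())
    hits = []
    if window <= 0 or len(seq) < window:
        return hits
    # Count target bases in the first window.
    count = sum(1 for ch in seq[:window] if ch in target)
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one: drop the base that left, add the base that entered.
            count -= seq[start - 1] in target
            count += seq[start + window - 1] in target
        fraction = count / window
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits
```

Calling `content_windows(sequence, bases="AG")` would flag purine-rich windows in the same way; overlapping hits could then be merged into longer candidate regions.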
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
Electrophoretic Mobility Shift Assay (EMSA) for the Study of RNA-Protein Interactions: The IRE/IRP Example
Institutions: Jewish General Hospital, McGill University.
RNA/protein interactions are critical for post-transcriptional regulatory pathways. Among the best-characterized cytosolic RNA-binding proteins are iron regulatory proteins, IRP1 and IRP2. They bind to iron responsive elements (IREs) within the untranslated regions (UTRs) of several target mRNAs, thereby controlling the mRNAs' translation or stability. IRE/IRP interactions have been widely studied by EMSA. Here, we describe the EMSA protocol for analyzing the IRE-binding activity of IRP1 and IRP2, which can be generalized to assess the activity of other RNA-binding proteins as well. A crude protein lysate containing an RNA-binding protein, or a purified preparation of this protein, is incubated with an excess of 32P-labeled RNA probe, allowing for complex formation. Heparin is added to preclude non-specific binding of protein to the probe. Subsequently, the mixture is analyzed by non-denaturing electrophoresis on a polyacrylamide gel. The free probe migrates fast, while the RNA/protein complex exhibits retarded mobility; hence, the procedure is also called “gel retardation” or “bandshift” assay. After completion of the electrophoresis, the gel is dried and RNA/protein complexes, as well as free probe, are detected by autoradiography. The overall goal of the protocol is to detect and quantify IRE/IRP and other RNA/protein interactions. Moreover, EMSA can also be used to determine specificity, binding affinity, and stoichiometry of the RNA/protein interaction under investigation.
Biochemistry, Issue 94, RNA metabolism, mRNA translation, post-transcriptional gene regulation, mRNA stability, IRE, IRP1, IRP2, iron metabolism, ferritin, transferrin receptor
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g., signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Analysis of Nephron Composition and Function in the Adult Zebrafish Kidney
Institutions: University of Notre Dame.
The zebrafish model has emerged as a relevant system to study kidney development, regeneration and disease. Both the embryonic and adult zebrafish kidneys are composed of functional units known as nephrons, which are highly conserved with other vertebrates, including mammals. Research in zebrafish has recently demonstrated that two distinctive phenomena transpire after adult nephrons incur damage: first, there is robust regeneration within existing nephrons that replaces the destroyed tubule epithelial cells; second, entirely new nephrons are produced from renal progenitors in a process known as neonephrogenesis. In contrast, humans and other mammals seem to have only a limited ability for nephron epithelial regeneration. To date, the mechanisms responsible for these kidney regeneration phenomena remain poorly understood. Since adult zebrafish kidneys undergo both nephron epithelial regeneration and neonephrogenesis, they provide an outstanding experimental paradigm to study these events. Further, there is a wide range of genetic and pharmacological tools available in the zebrafish model that can be used to delineate the cellular and molecular mechanisms that regulate renal regeneration. One essential aspect of such research is the evaluation of nephron structure and function. This protocol describes a set of labeling techniques that can be used to gauge renal composition and test nephron functionality in the adult zebrafish kidney. Thus, these methods are widely applicable to the future phenotypic characterization of adult zebrafish kidney injury paradigms, which include but are not limited to, nephrotoxicant exposure regimes or genetic methods of targeted cell death such as the nitroreductase mediated cell ablation technique. Further, these methods could be used to study genetic perturbations in adult kidney formation and could also be applied to assess renal status during chronic disease modeling.
Cellular Biology, Issue 90, zebrafish, kidney, nephron, nephrology, renal, regeneration, proximal tubule, distal tubule, segment, mesonephros, physiology, acute kidney injury (AKI)
Isolation of mRNAs Associated with Yeast Mitochondria to Study Mechanisms of Localized Translation
Institutions: Technion - Israel Institute of Technology.
Most mitochondrial proteins are encoded in the nucleus and need to be imported into the organelle. Import may occur while the protein is synthesized near the mitochondria. Support for this possibility is derived from recent studies, in which many mRNAs encoding mitochondrial proteins were shown to be localized to the vicinity of mitochondria. Together with earlier demonstrations of ribosomes’ association with the outer membrane, these results suggest a localized translation process. Such localized translation may improve import efficiency, provide unique regulation sites and minimize cases of ectopic expression. Diverse methods have been used to characterize the factors and elements that mediate localized translation. Standard among these is subcellular fractionation by differential centrifugation. This protocol has the advantage of isolating mRNAs, ribosomes and proteins in a single procedure. These can then be characterized by various molecular and biochemical methods. Furthermore, transcriptomics and proteomics methods can be applied to the resulting material, thereby allowing genome-wide insights. The utilization of yeast as a model organism for such studies has the advantages of speed, cost and simplicity. Furthermore, the advanced genetic tools and available deletion strains facilitate verification of candidate factors.
Biochemistry, Issue 85, mitochondria, mRNA localization, Yeast, S. cerevisiae, microarray, localized translation, biochemical fractionation
Quantitative Comparison of cis-Regulatory Element (CRE) Activities in Transgenic Drosophila melanogaster
Institutions: University of Dayton.
Gene expression patterns are specified by cis-regulatory element (CRE) sequences, which are also called enhancers or cis-regulatory modules. A typical CRE possesses an arrangement of binding sites for several transcription factor proteins that confer a regulatory logic specifying when, where, and at what level the regulated gene(s) is expressed. The full set of CREs within an animal genome encodes the organism's program for development1, and empirical as well as theoretical studies indicate that mutations in CREs played a prominent role in morphological evolution2-4. Moreover, human genome wide association studies indicate that genetic variation in CREs contributes substantially to phenotypic variation5,6. Thus, understanding regulatory logic and how mutations affect such logic is a central goal of genetics.
Reporter transgenes provide a powerful method to study the in vivo function of CREs. Here a known or suspected CRE sequence is coupled to heterologous promoter and coding sequences for a reporter gene encoding an easily observable protein product. When a reporter transgene is inserted into a host organism, the CRE's activity becomes visible in the form of the encoded reporter protein. P-element mediated transgenesis in the fruit fly species Drosophila (D.) melanogaster7 has been used for decades to introduce reporter transgenes into this model organism, though the genomic placement of transgenes is random. Hence, reporter gene activity is strongly influenced by the local chromatin and gene environment, limiting CRE comparisons to being qualitative. In recent years, the phiC31 based integration system was adapted for use in D. melanogaster to insert transgenes into specific genome landing sites8-10. This capability has made the quantitative measurement of gene and, relevant here, CRE activity11-13 feasible. The production of transgenic fruit flies can be outsourced, including phiC31-based integration, eliminating the need to purchase expensive equipment and/or have proficiency at specialized transgene injection protocols.
Here, we present a general protocol to quantitatively evaluate a CRE's activity, and show how this approach can be used to measure the effects of an introduced mutation on a CRE's activity and to compare the activities of orthologous CREs. Although the examples given are for a CRE active during fruit fly metamorphosis, the approach can be applied to other developmental stages, fruit fly species, or model organisms. Ultimately, a more widespread use of this approach to study CREs should advance an understanding of regulatory logic and how logic can vary and evolve.
Developmental Biology, Issue 58, Cis-regulatory element, CRE, cis-regulatory module, enhancer, site-specific integration, reporter transgenes, confocal microscopy, regulatory logic, transcription factors, binding sites, Drosophila melanogaster, Drosophila
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
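The three motif classes differ only in which IUPAC nucleotide codes they contain, which a short Python sketch can illustrate. The code below translates a consensus motif into a regular expression and lists where it occurs in a sequence. It is not SCOPE's implementation and performs no scoring of over-representation or position preference; the names `motif_to_regex` and `find_motif` are hypothetical, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Translate an IUPAC consensus motif into a regex pattern string."""
    return "".join(IUPAC[ch] for ch in motif.upper())

def find_motif(motif, seq):
    """Return 0-based start positions of `motif` in `seq` (forward strand only).

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]
```

With this sketch, a non-degenerate motif such as ACCGGT matches only itself, ASCGWT matches four concrete sequences (S is C or G, W is A or T), and ACCnnnnnnnnGGT matches any sequence with ACC and GGT separated by eight arbitrary bases.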
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif