In vivo methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro microarray based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs and thus predicting the function for a sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris Hildenborough.
22 Related JoVE Articles!
Fluorescence Based Primer Extension Technique to Determine Transcriptional Starting Points and Cleavage Sites of RNases In Vivo
Institutions: University of Tübingen.
Fluorescence based primer extension (FPE) is a molecular method to determine transcriptional starting points or processing sites of RNA molecules. This is achieved by reverse transcription of the RNA of interest using specific fluorescently labeled primers and subsequent analysis of the resulting cDNA fragments by denaturing polyacrylamide gel electrophoresis. Simultaneously, a traditional Sanger sequencing reaction is run on the gel to map the ends of the cDNA fragments to their exact corresponding bases. In contrast to 5'-RACE (Rapid Amplification of cDNA Ends), where the product must be cloned and multiple candidates sequenced, the bulk of cDNA fragments generated by primer extension can be simultaneously detected in one gel run. In addition, the whole procedure (from reverse transcription to final analysis of the results) can be completed in one working day. By using fluorescently labeled primers, the use of hazardous radioactive isotope labeled reagents can be avoided and processing times are reduced as products can be detected during the electrophoresis procedure.
In the following protocol, we describe an in vivo fluorescent primer extension method to reliably and rapidly detect the 5' ends of RNAs to deduce transcriptional starting points and RNA processing sites (e.g., by toxin-antitoxin system components) in S. aureus, E. coli and other bacteria.
Molecular Biology, Issue 92, Primer extension, RNA mapping, 5' end, fluorescent primer, transcriptional starting point, TSP, RNase, toxin-antitoxin, cleavage site, gel electrophoresis, DNA isolation, RNA processing
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Genetic Manipulation in Δku80 Strains for Functional Genomic Analysis of Toxoplasma gondii
Institutions: The Geisel School of Medicine at Dartmouth.
Targeted genetic manipulation using homologous recombination is the method of choice for functional genomic analysis to obtain a detailed view of gene function and phenotype(s). The development of mutant strains with targeted gene deletions, targeted mutations, complemented gene function, and/or tagged genes provides powerful strategies to address gene function, particularly if these genetic manipulations can be efficiently targeted to the gene locus of interest using integration mediated by double cross over homologous recombination.
Due to very high rates of nonhomologous recombination, functional genomic analysis of Toxoplasma gondii has been previously limited by the absence of efficient methods for targeting gene deletions and gene replacements to specific genetic loci. Recently, we abolished the major pathway of nonhomologous recombination in type I and type II strains of T. gondii by deleting the gene encoding the KU80 protein1,2. The Δku80 strains behave normally during tachyzoite (acute) and bradyzoite (chronic) stages in vitro and in vivo and exhibit essentially a 100% frequency of homologous recombination. The Δku80 strains make functional genomic studies feasible on the single gene as well as on the genome scale1-4.
Here, we report methods for using type I and type II Δku80Δhxgprt strains to advance gene targeting approaches in T. gondii. We outline efficient methods for generating gene deletions, gene replacements, and tagged genes by targeted insertion or deletion of the hypoxanthine-xanthine-guanine phosphoribosyltransferase (HXGPRT) selectable marker. The described gene targeting protocol can be used in a variety of ways in Δku80 strains to advance functional analysis of the parasite genome and to develop single strains that carry multiple targeted genetic manipulations. The application of this genetic method and subsequent phenotypic assays will reveal fundamental and unique aspects of the biology of T. gondii and related significant human pathogens that cause malaria (Plasmodium sp.) and cryptosporidiosis (Cryptosporidium
Infectious Diseases, Issue 77, Genetics, Microbiology, Infection, Medicine, Immunology, Molecular Biology, Cellular Biology, Biomedical Engineering, Bioengineering, Genomics, Parasitology, Pathology, Apicomplexa, Coccidia, Toxoplasma, Genetic Techniques, Gene Targeting, Eukaryota, Toxoplasma gondii, genetic manipulation, gene targeting, gene deletion, gene replacement, gene tagging, homologous recombination, DNA, sequencing
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values1. 'How to use PSA' remains a current issue, either for diagnosis, where a gray zone corresponding to a serum concentration of 2.5-10 ng/ml does not allow a clear differentiation to be made between cancer and noncancer2, or for patient follow-up, where analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application3,4. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g. PCA3 in prostate cancer5,6, and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the number of transcriptional regulatory motifs is at least 4.5x higher than the number corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer7-10. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Protocols for Implementing an Escherichia coli Based TX-TL Cell-Free Expression System for Synthetic Biology
Institutions: California Institute of Technology, Massachusetts Institute of Technology, University of Minnesota.
Ideal cell-free expression systems can theoretically emulate an in vivo cellular environment in a controlled in vitro setting. This is useful for expressing proteins and genetic circuits in a controlled manner as well as for providing a prototyping environment for synthetic biology.2,3 To achieve the latter goal, cell-free expression systems that preserve endogenous Escherichia coli transcription-translation mechanisms are able to more accurately reflect in vivo cellular dynamics than those based on T7 RNA polymerase transcription. We describe the preparation and execution of an efficient endogenous E. coli based transcription-translation (TX-TL) cell-free expression system that can produce amounts of protein equivalent to T7-based systems at a 98% cost reduction compared with similar commercial systems.4,5 The preparation of buffers and crude cell extract is described, as well as the execution of a three tube TX-TL reaction. The entire protocol takes five days to prepare and yields enough material for up to 3000 single reactions in one preparation. Once prepared, each reaction takes under 8 hr from setup to data collection and analysis. Mechanisms of regulation and transcription exogenous to E. coli, such as lac/tet repressors and T7 RNA polymerase, can be supplemented.6 Endogenous properties, such as mRNA and DNA degradation rates, can also be adjusted.7 The TX-TL cell-free expression system has been demonstrated for large-scale circuit assembly, exploring biological phenomena, and expression of proteins under both T7- and endogenous promoters.6,8 Accompanying mathematical models are available.9,10 The resulting system has unique applications in synthetic biology as a prototyping environment, or "TX-TL biomolecular breadboard."
Cellular Biology, Issue 79, Bioengineering, Synthetic Biology, Chemistry Techniques, Synthetic, Molecular Biology, control theory, TX-TL, cell-free expression, in vitro, transcription-translation, cell-free protein synthesis, synthetic biology, systems biology, Escherichia coli cell extract, biological circuits, biomolecular breadboard
Mouse Genome Engineering Using Designer Nucleases
Institutions: University of Zurich, University of Minnesota.
Transgenic mice carrying site-specific genome modifications (knockout, knock-in) are of vital importance for dissecting complex biological systems as well as for modeling human diseases and testing therapeutic strategies. Recent advances in the use of designer nucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system for site-specific genome engineering open the possibility to perform rapid targeted genome modification in virtually any laboratory species without the need to rely on embryonic stem (ES) cell technology. A genome editing experiment typically starts with identification of designer nuclease target sites within a gene of interest followed by construction of custom DNA-binding domains to direct nuclease activity to the investigator-defined genomic locus. Designer nuclease plasmids are in vitro transcribed to generate mRNA for microinjection of fertilized mouse oocytes. Here, we provide a protocol for achieving targeted genome modification by direct injection of TALEN mRNA into fertilized mouse oocytes.
Genetics, Issue 86, Oocyte microinjection, Designer nucleases, ZFN, TALEN, Genome Engineering
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Analysis of RNA Processing Reactions Using Cell Free Systems: 3' End Cleavage of Pre-mRNA Substrates in vitro
Institutions: The Scripps Research Institute, City College of New York.
The 3’ end of mammalian mRNAs is not formed by abrupt termination of transcription by RNA polymerase II (RNPII). Instead, RNPII synthesizes precursor mRNA beyond the end of mature RNAs, and an active process of endonuclease activity is required at a specific site. Cleavage of the precursor RNA normally occurs 10-30 nt downstream from the consensus polyA site (AAUAAA) after the CA dinucleotides. Proteins from the cleavage complex, a multifactorial protein complex of approximately 800 kDa, accomplish this specific nuclease activity. Specific RNA sequences upstream and downstream of the polyA site control the recruitment of the cleavage complex. Immediately after cleavage, pre-mRNAs are polyadenylated by the polyA polymerase (PAP) to produce mature stable RNA messages.
Processing of the 3’ end of an RNA transcript may be studied using cellular nuclear extracts with specific radiolabeled RNA substrates. In sum, a long 32P-labeled uncleaved precursor RNA is incubated with nuclear extracts in vitro, and cleavage is assessed by gel electrophoresis and autoradiography. When proper cleavage occurs, a shorter 5’ cleaved product is detected and quantified. Here, we describe the cleavage assay in detail using, as an example, the 3’ end processing of HIV-1 mRNAs.
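The positional rule stated in this abstract (cleavage 10-30 nt downstream of the AAUAAA signal, after a CA dinucleotide) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not part of the published protocol: the function name, the 0-based coordinates, and the choice to measure the 10-30 nt distance from the end of the hexamer to the start of the CA are all hypothetical.

```python
import re


def candidate_cleavage_sites(rna, min_gap=10, max_gap=30):
    """Return (signal_start, cleavage_pos) pairs for each AAUAAA signal.

    Assumptions (not from the source): positions are 0-based, the gap is
    counted from the end of the hexamer to the start of a CA dinucleotide,
    and cleavage_pos is the index immediately after that CA.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    # Lookahead so that overlapping signals would also be reported.
    for match in re.finditer(r"(?=AAUAAA)", rna):
        signal_end = match.start() + 6
        first = signal_end + min_gap
        last = min(signal_end + max_gap, len(rna) - 2)
        for i in range(first, last + 1):
            if rna[i:i + 2] == "CA":
                hits.append((match.start(), i + 2))
    return hits


if __name__ == "__main__":
    # Toy substrate: signal at index 2, CA located 12 nt after the signal.
    toy = "GG" + "AAUAAA" + "G" * 12 + "CA" + "GGGG"
    print(candidate_cleavage_sites(toy))
```

A real substrate would also depend on the upstream and downstream control sequences mentioned above, which this sketch deliberately ignores.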
Infectious Diseases, Issue 87, Cleavage, Polyadenylation, mRNA processing, Nuclear extracts, 3' Processing Complex
Massively Parallel Reporter Assays in Cultured Mammalian Cells
Institutions: Broad Institute.
The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence of interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.
Genetics, Issue 90, gene regulation, transcriptional regulation, sequence-activity mapping, reporter assay, library cloning, transfection, tag sequencing, mammalian cells
Generation of High Quality Chromatin Immunoprecipitation DNA Template for High-throughput Sequencing (ChIP-seq)
Institutions: Children's Hospital of Philadelphia Research Institute, University of Pennsylvania.
ChIP-sequencing (ChIP-seq) methods directly offer whole-genome coverage: combining chromatin immunoprecipitation (ChIP) with massively parallel sequencing identifies the repertoire of mammalian DNA sequences bound by transcription factors in vivo. "Next-generation" genome sequencing technologies provide a 1-2 order of magnitude increase in the amount of sequence that can be cost-effectively generated compared with older technologies, thus allowing ChIP-seq methods to directly provide whole-genome coverage for effective profiling of mammalian protein-DNA interactions.
For successful ChIP-seq approaches, one must generate high quality ChIP DNA template to obtain the best sequencing outcomes. The description is based around experience with the protein product of the gene most strongly implicated in the pathogenesis of type 2 diabetes, namely the transcription factor transcription factor 7-like 2 (TCF7L2). This factor has also been implicated in various cancers.
Outlined is how to generate high quality ChIP DNA template derived from the colorectal carcinoma cell line, HCT116, in order to build a high-resolution map through sequencing to determine the genes bound by TCF7L2, giving further insight into its key role in the pathogenesis of complex traits.
Molecular Biology, Issue 74, Genetics, Biochemistry, Microbiology, Medicine, Proteins, DNA-Binding Proteins, Transcription Factors, Chromatin Immunoprecipitation, Genes, chromatin, immunoprecipitation, ChIP, DNA, PCR, sequencing, antibody, cross-link, cell culture, assay
High-throughput Functional Screening using a Homemade Dual-glow Luciferase Assay
Institutions: Massachusetts General Hospital.
We present a rapid and inexpensive high-throughput screening protocol to identify transcriptional regulators of alpha-synuclein, a gene associated with Parkinson's disease. 293T cells are transiently transfected with plasmids from an arrayed ORF expression library, together with luciferase reporter plasmids, in a one-gene-per-well microplate format. Firefly luciferase activity is assayed after 48 hr to determine the effects of each library gene upon alpha-synuclein transcription, normalized to expression from an internal control construct (a hCMV promoter directing Renilla luciferase). This protocol is facilitated by a bench-top robot enclosed in a biosafety cabinet, which performs aseptic liquid handling in 96-well format. Our automated transfection protocol is readily adaptable to high-throughput lentiviral library production or other functional screening protocols requiring triple-transfections of large numbers of unique library plasmids in conjunction with a common set of helper plasmids. We also present an inexpensive and validated alternative to commercially-available, dual luciferase reagents which employs PTC124, EDTA, and pyrophosphate to suppress firefly luciferase activity prior to measurement of Renilla luciferase. Using these methods, we screened 7,670 human genes and identified 68 regulators of alpha-synuclein. This protocol is easily modifiable to target other genes of interest.
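The normalization described in this abstract (firefly signal divided by the Renilla internal-control signal for each well) can be sketched in a few lines of Python. The abstract does not give the analysis in detail, so everything beyond the firefly/Renilla ratio is an assumption made for illustration: the well identifiers, the use of empty-vector control wells, and the median-based fold change are all hypothetical.

```python
from statistics import median


def normalized_activity(firefly, renilla):
    """Firefly signal normalized to the Renilla internal control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive to normalize")
    return firefly / renilla


def fold_changes(wells, control_ids):
    """Fold change of each well relative to the median control ratio.

    wells: dict mapping a well id to a (firefly, renilla) reading pair.
    control_ids: ids of control wells (hypothetical, e.g. empty vector).
    """
    ratios = {wid: normalized_activity(f, r) for wid, (f, r) in wells.items()}
    baseline = median(ratios[cid] for cid in control_ids)
    return {wid: ratio / baseline for wid, ratio in ratios.items()}


if __name__ == "__main__":
    # Made-up luminescence readings, for illustration only.
    plate = {
        "ctrl1": (1000.0, 500.0),
        "ctrl2": (2000.0, 1000.0),
        "geneA": (4000.0, 500.0),
    }
    print(fold_changes(plate, ["ctrl1", "ctrl2"]))
```

Using the median of the control wells rather than the mean is a design choice that limits the influence of a single failed transfection; a real screen would likely add replicate handling and a hit-calling threshold.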
Cellular Biology, Issue 88, Luciferases, Gene Transfer Techniques, Transfection, High-Throughput Screening Assays, Transfections, Robotics
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
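The conversion of reactivities into pseudo-energy constraints mentioned above is commonly written as a logarithmic pseudo-free-energy term, ΔG_SHAPE(i) = m · ln(reactivity_i + 1) + b. The Python sketch below illustrates that formula only. The slope and intercept defaults (2.6 and -0.8 kcal/mol) are commonly cited values and are an assumption here; the abstract does not state which parameters were used, and treating negative values as missing data is likewise an assumed convention.

```python
import math


def shape_pseudo_energy(reactivity, slope=2.6, intercept=-0.8):
    """Pseudo-free-energy term (kcal/mol) for one nucleotide.

    Implements slope * ln(reactivity + 1) + intercept. The default
    parameters are assumed, commonly cited values, not taken from the
    source. Missing data (None or a negative reactivity) contributes 0.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return slope * math.log(reactivity + 1.0) + intercept


def pseudo_energy_profile(reactivities):
    """Apply the pseudo-energy formula to a whole reactivity table."""
    return [shape_pseudo_energy(r) for r in reactivities]


if __name__ == "__main__":
    # Made-up normalized reactivities; -999 marks a position with no data.
    for r, e in zip([0.0, 0.3, 1.0, -999], pseudo_energy_profile([0.0, 0.3, 1.0, -999])):
        print(f"reactivity={r:>7}  pseudo-energy={e:+.3f} kcal/mol")
```

Under these parameters an unreactive nucleotide receives a small bonus toward pairing, while a highly reactive one is penalized, which is how the probing data steer the folding algorithm toward structures consistent with the experiment.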
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
Vaccinia Virus Infection & Temporal Analysis of Virus Gene Expression: Part 3
Institutions: MIT - Massachusetts Institute of Technology.
The family Poxviridae consists of large double-stranded DNA-containing viruses that replicate exclusively in the cytoplasm of infected cells. Members of the orthopox genus include variola, the causative agent of human smallpox, monkeypox, and vaccinia (VAC), the prototypic member of the virus family. Within the relatively large (~200 kb) vaccinia genome, three classes of genes are encoded: early, intermediate, and late. While all three classes are transcribed by virally-encoded RNA polymerases, each class serves a different function in the life cycle of the virus. Poxviruses utilize multiple strategies for modulation of the host cellular environment during infection. In order to understand regulation of both host and virus gene expression, we have utilized genome-wide approaches to analyze transcript abundance from both virus and host cells. Here, we demonstrate time course infections of HeLa cells with Vaccinia virus and sampling of RNA at several time points post-infection. Both host and viral total RNA is isolated and amplified for hybridization to microarrays for analysis of gene expression.
Microbiology, Issue 26, Vaccinia, virus, infection, HeLa, Microarray, amplified RNA, amino allyl, RNA, Ambion Amino Allyl MessageAmpII, gene expression
Direct Restart of a Replication Fork Stalled by a Head-On RNA Polymerase
Institutions: Rockefeller University.
Studies suggest that replication forks are arrested due to encounters with head-on transcription complexes. Yet, the fate of the replisome and RNA polymerase (RNAP) following a head-on collision is unknown. Here, we find that the E. coli replisome stalls upon collision with a head-on transcription complex, but instead of collapsing, the replication fork remains highly stable and eventually resumes elongation after displacing the RNAP from DNA. We also find that the transcription-repair coupling factor, Mfd, promotes direct restart of the fork following the collision by facilitating displacement of the RNAP. These findings demonstrate the intrinsic stability of the replication apparatus and a novel role for the transcription-coupled repair pathway in promoting replication past an RNAP block.
Cellular Biology, Issue 38, replication, transcription, transcription-coupled repair, replisome, RNA polymerase, collision
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules, which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score that estimates how accurate the predictions are in the absence of experimental data. To accommodate special requests from end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. The structural information can be collected by users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best program for protein structure and function prediction in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever-expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on using target genes of known function to identify non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although neither approach is effortless. As a corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in limited tissues, plant species or stress treatments, which is currently an important challenge in understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
The ITS2 Database
Institutions: University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1 and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8.
The ITS2 Database9 presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12 (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14 search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16 for multiple sequence-structure alignment calculation and Neighbor Joining18 tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
A Toolkit to Enable Hydrocarbon Conversion in Aqueous Environments
Institutions: Delft University of Technology, Delft University of Technology.
This work puts forward a toolkit that enables the conversion of alkanes by Escherichia coli and presents a proof of principle of its applicability. The toolkit consists of multiple standard interchangeable parts (BioBricks)9 addressing the conversion of alkanes, regulation of gene expression and survival in toxic hydrocarbon-rich environments.
A three-step pathway for alkane degradation was implemented in E. coli to enable the conversion of medium- and long-chain alkanes to their respective alkanols, alkanals and ultimately alkanoic acids. The latter were metabolized via the native β-oxidation pathway. To facilitate the oxidation of medium-chain alkanes (C5-C13) and cycloalkanes (C5-C8), four genes (alkB2) of the alkane hydroxylase system from Gordonia were transformed into E. coli. For the conversion of long-chain alkanes (C15-C36), the ladA gene from Geobacillus thermodenitrificans was implemented. For the required further steps of the degradation process, ADH and ALDH (originating from G. thermodenitrificans) were introduced10,11. The activity was measured by resting cell assays. For each oxidative step, enzyme activity was observed.
To optimize the process efficiency, the expression was only induced under low glucose conditions: a substrate-regulated promoter, pCaiF, was used. pCaiF is present in E. coli K12 and regulates the expression of the genes involved in the degradation of non-glucose carbon sources.
The last part of the toolkit, targeting survival, was implemented using solvent tolerance genes, PhPFDα and β, both from Pyrococcus horikoshii OT3. Organic solvents can induce cell stress and decrease survivability by negatively affecting protein folding. As chaperones, PhPFDα and β improve the protein folding process, e.g. in the presence of alkanes. The expression of these genes led to improved hydrocarbon tolerance, shown by an increased growth rate (up to 50%) in the presence of 10% n-hexane in the culture medium.
Summarizing, the results indicate that the toolkit enables E. coli to convert and tolerate hydrocarbons in aqueous environments. As such, it represents an initial step towards a sustainable solution for oil-remediation using a synthetic biology approach.
Bioengineering, Issue 68, Microbiology, Biochemistry, Chemistry, Chemical Engineering, Oil remediation, alkane metabolism, alkane hydroxylase system, resting cell assay, prefoldin, Escherichia coli, synthetic biology, homologous interaction mapping, mathematical model, BioBrick, iGEM
Substrate Generation for Endonucleases of CRISPR/Cas Systems
Institutions: Max-Planck-Institute for Terrestrial Microbiology.
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) 1-3. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems 4. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster 5. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs 6-8. These crRNAs consist of the spacer sequence flanked by a 5' terminal (8 nucleotides) and a 3' terminal tag derived from the repeat sequence 9. A repeated infection of the virus can now be blocked as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity 10. Finally, for CRISPR/Cas type 1 systems, the nuclease Cas3 will destroy the detected invader DNA 11,12.
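The processing of a precursor-crRNA into mature crRNAs described above can be illustrated with a small Python sketch. It assumes, for illustration only, that the cut falls inside every repeat exactly 8 nucleotides upstream of the repeat's 3' end, so that each mature crRNA carries the 8-nucleotide 5' tag, the spacer, and the remainder of the next repeat as its 3' tag. The repeat and spacer sequences are invented placeholders, not taken from any real CRISPR locus, and the function names are hypothetical.

```python
def build_pre_crrna(repeat, spacers):
    """Concatenate repeat-spacer-repeat-...-repeat into one precursor transcript."""
    parts = [repeat]
    for spacer in spacers:
        parts.extend([spacer, repeat])
    return "".join(parts)


def cut_sites(repeat, spacers, tag_len=8):
    """Return one cut position per repeat, tag_len nt before the repeat's 3' end."""
    pos = len(repeat)              # end of the first repeat
    sites = [pos - tag_len]
    for spacer in spacers:
        pos += len(spacer) + len(repeat)   # end of the next repeat
        sites.append(pos - tag_len)
    return sites


def mature_crrnas(repeat, spacers, tag_len=8):
    """Split the precursor at every cut site and keep the spacer-containing
    fragments. The leading and trailing repeat fragments are discarded."""
    if not 0 < tag_len < len(repeat):
        raise ValueError("tag_len must be shorter than the repeat")
    pre = build_pre_crrna(repeat, spacers)
    sites = cut_sites(repeat, spacers, tag_len)
    return [pre[start:end] for start, end in zip(sites, sites[1:])]


# Invented placeholder sequences (assumptions for the demo only).
REPEAT = "GGCCGGCCGGCCGGCCGGCC" + "AUUGAAAC"   # last 8 nt become the 5' tag
SPACERS = ["ACGUACGUACGUACGUACGUACGUACGUACGUAC", "UUGGCCAAUUGGCCAAUUGGCCAAUUGGCCAAUUGG"]

for crrna in mature_crrnas(REPEAT, SPACERS):
    print(crrna[:8], crrna[8:-(len(REPEAT) - 8)], crrna[-(len(REPEAT) - 8):])
```

Each printed line shows the 5' tag, the spacer and the 3' tag of one mature crRNA, mirroring the layout given in the text.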
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases 4.
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
Molecular biology, Issue 67, CRISPR/Cas, endonuclease, in vitro transcription, crRNA, Cas6
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
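To make the three motif types concrete, the following Python sketch shows how motifs written in IUPAC nucleotide codes, such as the examples above, can be matched against a sequence. It is not the BEAM, PRISM or SPACER algorithm and performs no scoring for over-representation or position preference; it only expands the degenerate codes into a regular expression and lists matches on one strand. The function names and the test sequences are assumptions made for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Expand an IUPAC motif (case-insensitive) into a regex string."""
    return "".join(IUPAC[ch.upper()] for ch in motif)


def find_motif(motif, sequence):
    """Return (position, matched_text) for every occurrence, overlaps included.

    Only the given strand is scanned; a fuller search would also scan
    the reverse complement.
    """
    # A lookahead lets the regex engine report overlapping occurrences.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Non-degenerate, degenerate and bipartite examples from the text,
# searched in invented test sequences.
print(find_motif("ACCGGT", "ACCGGTACCGGT"))
print(find_motif("ASCGWT", "TTACCGATGGAGCGTTCC"))
print(find_motif("ACCnnnnnnnnGGT", "GGACCTTTTTTTTGGTAA"))
```

Here S stands for C or G and W for A or T, so ASCGWT covers four concrete hexamers, while the run of n characters in the bipartite motif accepts any eight bases between the two fixed halves.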
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and strand indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially, Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. Recombination activity was detected on these subsets, which served as intermediates. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic, and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase, Tre, had 19 amino acid changes compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution
Principles of Site-Specific Recombinase (SSR) Technology
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Site-specific recombinase (SSR) technology allows the manipulation of gene structure to explore gene function and has become an integral tool of molecular biology. Site-specific recombinases are proteins that bind to distinct DNA target sequences. The Cre/lox system was first described in bacteriophages during the 1980s. Cre recombinase is a Type I topoisomerase that catalyzes site-specific recombination of DNA between two loxP (locus of X-over P1) sites. The Cre/lox system does not require any cofactors. LoxP sequences contain distinct binding sites for Cre recombinases that surround a directional core sequence where recombination and rearrangement take place. When cells contain loxP sites and express the Cre recombinase, a recombination event occurs. Double-stranded DNA is cut at both loxP sites by the Cre recombinase, rearranged, and ligated ("scissors and glue"). Products of the recombination event depend on the relative orientation of the asymmetric sequences.
SSR technology is frequently used as a tool to explore gene function. Here the gene of interest is flanked with Cre target sites loxP ("floxed"). Animals are then crossed with animals expressing the Cre recombinase under the control of a tissue-specific promoter. In tissues that express the Cre recombinase, it binds to the target sequences and excises the floxed gene. Controlled gene deletion allows the investigation of gene function in specific tissues and at distinct time points. Analysis of gene function employing SSR technology (conditional mutagenesis) has significant advantages over traditional knock-outs, where gene deletion is frequently lethal.
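How the product depends on site orientation can be shown with a short Python sketch at the sequence level. It assumes the commonly cited 34-bp loxP sequence (two 13-bp inverted repeats around an 8-bp asymmetric core) and models only the two textbook outcomes for a pair of sites on one molecule: excision when the sites point the same way and inversion when they point in opposite directions. The flanking and "gene" sequences are invented placeholders, the function names are hypothetical, and no enzymatic detail is modeled.

```python
# Commonly cited loxP site: 13-bp repeat + 8-bp core + 13-bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq):
    """Return sorted (position, strand) pairs for loxP in either orientation."""
    sites = []
    for strand, site in (("+", LOXP), ("-", revcomp(LOXP))):
        start = seq.find(site)
        while start != -1:
            sites.append((start, strand))
            start = seq.find(site, start + 1)
    return sorted(sites)


def recombine(seq):
    """Apply one Cre-style recombination event between exactly two loxP sites."""
    sites = find_sites(seq)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    (first, strand1), (second, strand2) = sites
    if strand1 == strand2:
        # Same orientation: the floxed segment is excised, one site remains.
        return "excision", seq[:first] + seq[second:]
    # Opposite orientation: the segment between the sites is inverted.
    inner_start = first + len(LOXP)
    return "inversion", seq[:inner_start] + revcomp(seq[inner_start:second]) + seq[second:]


# Invented placeholder sequences for the demo.
gene = "GGGCCCAAATTT"
floxed = "AAAA" + LOXP + gene + LOXP + "CCCC"
flipped = "AAAA" + LOXP + gene + revcomp(LOXP) + "CCCC"
print(recombine(floxed))
print(recombine(flipped))
```

The asymmetric 8-bp core is what gives each site a direction; without it the two orientations would be indistinguishable in this model.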
Cellular Biology, Issue 15, Molecular Biology, Site-Specific Recombinase, Cre recombinase, Cre/lox system, transgenic animals, transgenic technology