Prototrophic bacteria grow on M-9 minimal salts medium supplemented with glucose (M-9 medium), which is used as a carbon and energy source. Auxotrophs can be generated using a transposome. The commercially available, Tn5-derived transposome used in this protocol consists of a linear segment of DNA containing an R6Kγ replication origin, a gene for kanamycin resistance and two mosaic sequence ends, which serve as transposase binding sites. The transposome, provided as a DNA/transposase protein complex, is introduced by electroporation into the prototrophic strain, Enterobacter sp. YSU, and randomly incorporates itself into this host’s genome. Transformants are replica plated onto Luria-Bertani agar plates containing kanamycin, (LB-kan) and onto M-9 medium agar plates containing kanamycin (M-9-kan). The transformants that grow on LB-kan plates but not on M-9-kan plates are considered to be auxotrophs. Purified genomic DNA from an auxotroph is partially digested, ligated and transformed into a pir+ Escherichia coli (E. coli) strain. The R6Kγ replication origin allows the plasmid to replicate in pir+ E. coli strains, and the kanamycin resistance marker allows for plasmid selection. Each transformant possesses a new plasmid containing the transposon flanked by the interrupted chromosomal region. Sanger sequencing and the Basic Local Alignment Search Tool (BLAST) suggest a putative identity of the interrupted gene. There are three advantages to using this transposome mutagenesis strategy. First, it does not rely on the expression of a transposase gene by the host. Second, the transposome is introduced into the target host by electroporation, rather than by conjugation or by transduction and therefore is more efficient. Third, the R6Kγ replication origin makes it easy to identify the mutated gene which is partially recovered in a recombinant plasmid. This technique can be used to investigate the genes involved in other characteristics of Enterobacter sp. YSU or of a wider variety of bacterial strains.
25 Related JoVE Articles!
The Logic, Experimental Steps, and Potential of Heterologous Natural Product Biosynthesis Featuring the Complex Antibiotic Erythromycin A Produced Through E. coli
Institutions: State University of New York at Buffalo, Massachusetts Institute of Technology.
The heterologous production of complex natural products is an approach designed to address current limitations and future possibilities. It is particularly useful for those compounds which possess therapeutic value but cannot be sufficiently produced or would benefit from an improved form of production. The experimental procedures involved can be subdivided into three components: 1) genetic transfer; 2) heterologous reconstitution; and 3) product analysis. Each experimental component is under continual optimization to meet the challenges and anticipate the opportunities associated with this emerging approach.
Heterologous biosynthesis begins with the identification of a genetic sequence responsible for a valuable natural product. Transferring this sequence to a heterologous host is complicated by the biosynthetic pathway complexity responsible for product formation. The antibiotic erythromycin A is a good example. Twenty genes (totaling >50 kb) are required for eventual biosynthesis. In addition, three of these genes encode megasynthases, multi-domain enzymes each ~300 kDa in size. This genetic material must be designed and transferred to E. coli
for reconstituted biosynthesis. The use of PCR isolation, operon construction, multi-cystronic plasmids, and electro-transformation will be described in transferring the erythromycin A genetic cluster to E. coli
Once transferred, the E. coli
cell must support eventual biosynthesis. This process is also challenging given the substantial differences between E. coli
and most original hosts responsible for complex natural product formation. The cell must provide necessary substrates to support biosynthesis and coordinately express the transferred genetic cluster to produce active enzymes. In the case of erythromycin A, the E. coli
cell had to be engineered to provide the two precursors (propionyl-CoA and (2S)-methylmalonyl-CoA) required for biosynthesis. In addition, gene sequence modifications, plasmid copy number, chaperonin co-expression, post-translational enzymatic modification, and process temperature were also required to allow final erythromycin A formation.
Finally, successful production must be assessed. For the erythromycin A case, we will present two methods. The first is liquid chromatography-mass spectrometry (LC-MS) to confirm and quantify production. The bioactivity of erythromycin A will also be confirmed through use of a bioassay in which the antibiotic activity is tested against Bacillus subtilis
. The assessment assays establish erythromycin A biosynthesis from E. coli
and set the stage for future engineering efforts to improve or diversify production and for the production of new complex natural compounds using this approach.
Biomedical Engineering, Issue 71, Chemical Engineering, Bioengineering, Molecular Biology, Cellular Biology, Microbiology, Basic Protocols, Biochemistry, Biotechnology, Heterologous biosynthesis, natural products, antibiotics, erythromycin A, metabolic engineering, E. coli
Diagnosing Pulmonary Tuberculosis with the Xpert MTB/RIF Test
Institutions: University of Bern, MCL Laboratories Inc..
Tuberculosis (TB) due to Mycobacterium tuberculosis
(MTB) remains a major public health issue: the infection affects up to one third of the world population1
, and almost two million people are killed by TB each year.2
Universal access to high-quality, patient-centered treatment for all TB patients is emphasized by WHO's Stop TB Strategy.3
The rapid detection of MTB in respiratory specimens and drug therapy based on reliable drug resistance testing results are a prerequisite for the successful implementation of this strategy. However, in many areas of the world, TB diagnosis still relies on insensitive, poorly standardized sputum microscopy methods. Ineffective TB detection and the emergence and transmission of drug-resistant MTB strains increasingly jeopardize global TB control activities.2
Effective diagnosis of pulmonary TB requires the availability - on a global scale - of standardized, easy-to-use, and robust diagnostic tools that would allow the direct detection of both the MTB complex and resistance to key antibiotics, such as rifampicin (RIF). The latter result can serve as marker for multidrug-resistant MTB (MDR TB) and has been reported in > 95% of the MDR-TB isolates.4, 5
The rapid availability of reliable test results is likely to directly translate into sound patient management decisions that, ultimately, will cure the individual patient and break the chain of TB transmission in the community.2
Cepheid's (Sunnyvale, CA, U.S.A.) Xpert MTB/RIF assay6, 7
meets the demands outlined above in a remarkable manner. It is a nucleic-acids amplification test for 1) the detection of MTB complex DNA in sputum or concentrated sputum sediments; and 2) the detection of RIF resistance-associated mutations of the rpoB
It is designed for use with Cepheid's GeneXpert Dx System that integrates and automates sample processing, nucleic acid amplification, and detection of the target sequences using real-time PCR and reverse transcriptase PCR. The system consists of an instrument, personal computer, barcode scanner, and preloaded software for running tests and viewing the results.9
It employs single-use disposable Xpert MTB/RIF cartridges that hold PCR reagents and host the PCR process. Because the cartridges are self-contained, cross-contamination between samples is eliminated.6
Current nucleic acid amplification methods used to detect MTB are complex, labor-intensive, and technically demanding. The Xpert MTB/RIF assay has the potential to bring standardized, sensitive and very specific diagnostic testing for both TB and drug resistance to universal-access point-of-care settings3
, provided that they will be able to afford it. In order to facilitate access, the Foundation for Innovative New Diagnostics (FIND) has negotiated significant price reductions. Current FIND-negotiated prices, along with the list of countries eligible for the discounts, are available on the web.10
Immunology, Issue 62, tuberculosis, drug resistance, rifampicin, rapid diagnosis, Xpert MTB/RIF test
Generation of High Quality Chromatin Immunoprecipitation DNA Template for High-throughput Sequencing (ChIP-seq)
Institutions: Children's Hospital of Philadelphia Research Institute, University of Pennsylvania .
ChIP-sequencing (ChIP-seq) methods directly offer whole-genome coverage, where combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing can be utilized to identify the repertoire of mammalian DNA sequences bound by transcription factors in vivo
. "Next-generation" genome sequencing technologies provide 1-2 orders of magnitude increase in the amount of sequence that can be cost-effectively generated over older technologies thus allowing for ChIP-seq methods to directly provide whole-genome coverage for effective profiling of mammalian protein-DNA interactions.
For successful ChIP-seq approaches, one must generate high quality ChIP DNA template to obtain the best sequencing outcomes. The description is based around experience with the protein product of the gene most strongly implicated in the pathogenesis of type 2 diabetes, namely the transcription factor transcription factor 7-like 2 (TCF7L2). This factor has also been implicated in various cancers.
Outlined is how to generate high quality ChIP DNA template derived from the colorectal carcinoma cell line, HCT116, in order to build a high-resolution map through sequencing to determine the genes bound by TCF7L2, giving further insight in to its key role in the pathogenesis of complex traits.
Molecular Biology, Issue 74, Genetics, Biochemistry, Microbiology, Medicine, Proteins, DNA-Binding Proteins, Transcription Factors, Chromatin Immunoprecipitation, Genes, chromatin, immunoprecipitation, ChIP, DNA, PCR, sequencing, antibody, cross-link, cell culture, assay
Fluorescence Based Primer Extension Technique to Determine Transcriptional Starting Points and Cleavage Sites of RNases In Vivo
Institutions: University of Tübingen.
Fluorescence based primer extension (FPE) is a molecular method to determine transcriptional starting points or processing sites of RNA molecules. This is achieved by reverse transcription of the RNA of interest using specific fluorescently labeled primers and subsequent analysis of the resulting cDNA fragments by denaturing polyacrylamide gel electrophoresis. Simultaneously, a traditional Sanger sequencing reaction is run on the gel to map the ends of the cDNA fragments to their exact corresponding bases. In contrast to 5'-RACE (Rapid Amplification of cDNA Ends), where the product must be cloned and multiple candidates sequenced, the bulk of cDNA fragments generated by primer extension can be simultaneously detected in one gel run. In addition, the whole procedure (from reverse transcription to final analysis of the results) can be completed in one working day. By using fluorescently labeled primers, the use of hazardous radioactive isotope labeled reagents can be avoided and processing times are reduced as products can be detected during the electrophoresis procedure.
In the following protocol, we describe an in vivo
fluorescent primer extension method to reliably and rapidly detect the 5' ends of RNAs to deduce transcriptional starting points and RNA processing sites (e.g.,
by toxin-antitoxin system components) in S. aureus, E. coli
and other bacteria.
Molecular Biology, Issue 92, Primer extension, RNA mapping, 5' end, fluorescent primer, transcriptional starting point, TSP, RNase, toxin-antitoxin, cleavage site, gel electrophoresis, DNA isolation, RNA processing
Enhanced Northern Blot Detection of Small RNA Species in Drosophila Melanogaster
Institutions: Institut de Génétique et de Biologie Moléculaire et Cellulaire, Istituto Italiano di Tecnologia.
The last decades have witnessed the explosion of scientific interest around gene expression control mechanisms at the RNA level. This branch of molecular biology has been greatly fueled by the discovery of noncoding RNAs as major players in post-transcriptional regulation. Such a revolutionary perspective has been accompanied and triggered by the development of powerful technologies for profiling short RNAs expression, both at the high-throughput level (genome-wide identification) or as single-candidate analysis (steady state accumulation of specific species). Although several state-of-art strategies are currently available for dosing or visualizing such fleeing molecules, Northern Blot assay remains the eligible approach in molecular biology for immediate and accurate evaluation of RNA expression. It represents a first step toward the application of more sophisticated, costly technologies and, in many cases, remains a preferential method to easily gain insights into RNA biology. Here we overview an efficient protocol (Enhanced Northern Blot) for detecting weakly expressed microRNAs (or other small regulatory RNA species) from Drosophila melanogaster
whole embryos, manually dissected larval/adult tissues or in vitro
cultured cells. A very limited amount of RNA is required and the use of material from flow cytometry-isolated cells can be also envisaged.
Molecular Biology, Issue 90, Northern blotting, Noncoding RNAs, microRNAs, rasiRNA, Gene expression, Gcm/Glide, Drosophila melanogaster
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico
RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
Polysome Fractionation and Analysis of Mammalian Translatomes on a Genome-wide Scale
Institutions: McGill University, Karolinska Institutet, McGill University.
mRNA translation plays a central role in the regulation of gene expression and represents the most energy consuming process in mammalian cells. Accordingly, dysregulation of mRNA translation is considered to play a major role in a variety of pathological states including cancer. Ribosomes also host chaperones, which facilitate folding of nascent polypeptides, thereby modulating function and stability of newly synthesized polypeptides. In addition, emerging data indicate that ribosomes serve as a platform for a repertoire of signaling molecules, which are implicated in a variety of post-translational modifications of newly synthesized polypeptides as they emerge from the ribosome, and/or components of translational machinery. Herein, a well-established method of ribosome fractionation using sucrose density gradient centrifugation is described. In conjunction with the in-house developed “anota” algorithm this method allows direct determination of differential translation of individual mRNAs on a genome-wide scale. Moreover, this versatile protocol can be used for a variety of biochemical studies aiming to dissect the function of ribosome-associated protein complexes, including those that play a central role in folding and degradation of newly synthesized polypeptides.
Biochemistry, Issue 87, Cells, Eukaryota, Nutritional and Metabolic Diseases, Neoplasms, Metabolic Phenomena, Cell Physiological Phenomena, mRNA translation, ribosomes,
protein synthesis, genome-wide analysis, translatome, mTOR, eIF4E, 4E-BP1
Highly Efficient Ligation of Small RNA Molecules for MicroRNA Quantitation by High-Throughput Sequencing
Institutions: University of Colorado, Boulder, University of Colorado, Denver.
MiRNA cloning and high-throughput sequencing, termed miR-Seq, stands alone as a transcriptome-wide approach to quantify miRNAs with single nucleotide resolution. This technique captures miRNAs by attaching 3’ and 5’ oligonucleotide adapters to miRNA molecules and allows de novo
miRNA discovery. Coupling with powerful next-generation sequencing platforms, miR-Seq has been instrumental in the study of miRNA biology. However, significant biases introduced by oligonucleotide ligation steps have prevented miR-Seq from being employed as an accurate quantitation tool. Previous studies demonstrate that biases in current miR-Seq methods often lead to inaccurate miRNA quantification with errors up to 1,000-fold for some miRNAs1,2
. To resolve these biases imparted by RNA ligation, we have developed a small RNA ligation method that results in ligation efficiencies of over 95% for both 3’ and 5′ ligation steps. Benchmarking this improved library construction method using equimolar or differentially mixed synthetic miRNAs, consistently yields reads numbers with less than two-fold deviation from the expected value. Furthermore, this high-efficiency miR-Seq method permits accurate genome-wide miRNA profiling from in vivo
total RNA samples2
Molecular Biology, Issue 93, RNA, ligation, miRNA, miR-Seq, linker, oligonucleotide, high-throughput sequencing
PAR-CliP - A Method to Identify Transcriptome-wide the Binding Sites of RNA Binding Proteins
Institutions: Rockefeller University, Max-Delbrück-Center for Molecular Medicine, Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB), Biozentrum der Universität Basel and Swiss Institute of Bioinformatics (SIB), Rockefeller University.
RNA transcripts are subjected to post-transcriptional gene regulation by interacting with hundreds of RNA-binding proteins (RBPs) and microRNA-containing ribonucleoprotein complexes (miRNPs) that are often expressed in a cell-type dependently. To understand how the interplay of these RNA-binding factors affects the regulation of individual transcripts, high resolution maps of in vivo
protein-RNA interactions are necessary1
A combination of genetic, biochemical and computational approaches are typically applied to identify RNA-RBP or RNA-RNP interactions. Microarray profiling of RNAs associated with immunopurified RBPs (RIP-Chip)2
defines targets at a transcriptome level, but its application is limited to the characterization of kinetically stable interactions and only in rare cases3,4
allows to identify the RBP recognition element (RRE) within the long target RNA. More direct RBP target site information is obtained by combining in vivo
followed by the isolation of crosslinked RNA segments and cDNA sequencing (CLIP)10
. CLIP was used to identify targets of a number of RBPs11-17
. However, CLIP is limited by the low efficiency of UV 254 nm RNA-protein crosslinking, and the location of the crosslink is not readily identifiable within the sequenced crosslinked fragments, making it difficult to separate UV-crosslinked target RNA segments from background non-crosslinked RNA fragments also present in the sample.
We developed a powerful cell-based crosslinking approach to determine at high resolution and transcriptome-wide the binding sites of cellular RBPs and miRNPs that we term PAR-CliP (Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation) (see Fig. 1A for an outline of the method). The method relies on the incorporation of photoreactive ribonucleoside analogs, such as 4-thiouridine (4-SU) and 6-thioguanosine (6-SG) into nascent RNA transcripts by living cells. Irradiation of the cells by UV light of 365 nm induces efficient crosslinking of photoreactive nucleoside-labeled cellular RNAs to interacting RBPs. Immunoprecipitation of the RBP of interest is followed by isolation of the crosslinked and coimmunoprecipitated RNA. The isolated RNA is converted into a cDNA library and deep sequenced using Solexa technology. One characteristic feature of cDNA libraries prepared by PAR-CliP is that the precise position of crosslinking can be identified by mutations residing in the sequenced cDNA. When using 4-SU, crosslinked sequences thymidine to cytidine transition, whereas using 6-SG results in guanosine to adenosine mutations. The presence of the mutations in crosslinked sequences makes it possible to separate them from the background of sequences derived from abundant cellular RNAs.
Application of the method to a number of diverse RNA binding proteins was reported in Hafner et al.18
Cellular Biology, Issue 41, UV crosslinking, RNA binding proteins, RNA binding motif, 4-thiouridine, 6-thioguanosine
Production of Xenopus tropicalis Egg Extracts to Identify Microtubule-associated RNAs
Institutions: Massachusetts General Hospital, Harvard Medical School.
Many organisms localize mRNAs to specific subcellular destinations to spatially and temporally control gene expression. Recent studies have demonstrated that the majority of the transcriptome is localized to a nonrandom position in cells and embryos. One approach to identify localized mRNAs is to biochemically purify a cellular structure of interest and to identify all associated transcripts. Using recently developed high-throughput sequencing technologies it is now straightforward to identify all RNAs associated with a subcellular structure. To facilitate transcript identification it is necessary to work with an organism with a fully sequenced genome. One attractive system for the biochemical purification of subcellular structures are egg extracts produced from the frog Xenopus laevis.
However, X. laevis
currently does not have a fully sequenced genome, which hampers transcript identification. In this article we describe a method to produce egg extracts from a related frog, X. tropicalis,
that has a fully sequenced genome. We provide details for microtubule polymerization, purification and transcript isolation. While this article describes a specific method for identification of microtubule-associated transcripts, we believe that it will be easily applied to other subcellular structures and will provide a powerful method for identification of localized RNAs.
Molecular Biology, Issue 76, Genetics, Developmental Biology, Biochemistry, Bioengineering, Cellular Biology, RNA, Messenger, Stored, RNA Processing, Post-Transcriptional, Xenopus, microtubules, egg extract, purification, RNA localization, mRNA, Xenopus tropicalis, eggs, animal model
Site-specific Bacterial Chromosome Engineering: ΦC31 Integrase Mediated Cassette Exchange (IMCE)
Institutions: University of Waterloo.
The bacterial chromosome may be used to stably maintain foreign DNA in the mega-base range1
. Integration into the chromosome circumvents issues such as plasmid replication, plasmid stability, plasmid incompatibility, and plasmid copy number variance. This method uses the site-specific integrase from the Streptomyces
phage (Φ) C312,3
. The ΦC31 integrase catalyzes a direct recombination between two specific DNA sites: attB
(34 and 39 bp, respectively)4
. This recombination is stable and does not revert5
. A "landing pad" (LP) sequence consisting of a spectinomycin- resistance gene, aadA
), and the E. coli
ß-glucuronidase gene (uidA
) flanked by attP
sites has been integrated into the chromosomes of Sinorhizobium meliloti, Ochrobactrum anthropi,
and Agrobacterium tumefaciens
in an intergenic region, the ampC
locus, and the tetA
locus, respectively. S. meliloti
is used in this protocol. Mobilizable donor vectors containing attB
sites flanking a stuffer red fluorescent protein (rfp)
gene and an antibiotic resistance gene have also been constructed. In this example the gentamicin resistant plasmid pJH110 is used. The rfp
may be replaced with a desired construct using Sph
I and Pst
I. Alternatively a synthetic construct flanked by attB
sites may be sub-cloned into a mobilizable vector such as pK19mob7
. The expression of the ΦC31 integrase gene (cloned from pHS628
) is driven by the lac
promoter, on a mobilizable broad host range plasmid pRK78139
A tetraparental mating protocol is used to transfer the donor cassette into the LP strain thereby replacing the markers in the LP sequence with the donor cassette. These cells are trans-integrants. Trans-integrants are formed with a typical efficiency of 0.5%. Trans-integrants are typically found within the first 500-1,000 colonies screened by antibiotic sensitivity or blue-white screening using 5-bromo-4-chloro-3-indolyl-beta-D-glucuronic acid (X-gluc). This protocol contains the mating and selection procedures for creating and isolating trans-integrants.
Bioengineering, Issue 61, ΦC31 Integrase, Rhizobiales, Chromosome Engineering, bacterial genetics
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1
. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2
. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5
to more rigorous statistical models, e.g.
Hidden Markov Models (HMMs)6-8
. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9
, which utilizes only explicit formulas-an innovation crucial to its performance advantages. More sophisticated then heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity-evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11
and epigenetic data12
to illustrate its usefulness.
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
Locked Nucleic Acid Flow Cytometry-fluorescence in situ Hybridization (LNA flow-FISH): a Method for Bacterial Small RNA Detection
Institutions: Naval Research Laboratory.
Fluorescence in situ
hybridization (FISH) is a powerful technique that is used to detect and localize specific nucleic acid sequences in the cellular environment. In order to increase throughput, FISH can be combined with flow cytometry (flow-FISH) to enable the detection of targeted nucleic acid sequences in thousands of individual cells. As a result, flow-FISH offers a distinct advantage over lysate/ensemble-based nucleic acid detection methods because each cell is treated as an independent observation, thereby permitting stronger statistical and variance analyses. These attributes have prompted the use of FISH and flow-FISH methods in a number of different applications and the utility of these methods has been successfully demonstrated in telomere length determination1,2
, cellular identification and gene expression3,4
, monitoring viral multiplication in infected cells5
, and bacterial community analysis and enumeration6
Traditionally, the specificity of FISH and flow-FISH methods has been imparted by DNA oligonucleotide probes. Recently however, the replacement of DNA oligonucleotide probes with nucleic acid analogs as FISH and flow-FISH probes has increased both the sensitivity and specificity of each technique due to the higher melting temperatures (Tm
) of these analogs for natural nucleic acids7,8
. Locked nucleic acid (LNA) probes are a type of nucleic acid analog that contain LNA nucleotides spiked throughout a DNA or RNA sequence9,10
. When coupled with flow-FISH, LNA probes have previously been shown to outperform conventional DNA probes7,11
and have been successfully used to detect eukaryotic mRNA12
and viral RNA in mammalian cells5
Here we expand this capability and describe a LNA flow-FISH method which permits the specific detection of RNA in bacterial cells (Figure 1
). Specifically, we are interested in the detection of small non-coding regulatory RNA (sRNA) which have garnered considerable interest in the past few years as they have been found to serve as key regulatory elements in many critical cellular processes13
. However, there are limited tools to study sRNAs and the challenges of detecting sRNA in bacterial cells is due in part to the relatively small size (typically 50-300 nucleotides in length) and low abundance of sRNA molecules as well as the general difficulty in working with smaller biological cells with varying cellular membranes. In this method, we describe fixation and permeabilzation conditions that preserve the structure of bacterial cells and permit the penetration of LNA probes as well as signal amplification steps which enable the specific detection of low abundance sRNA (Figure 2
Immunology, Issue 59, fluorescence in situ hybridization, FISH, flow cytometry, locked nucleic acid, sRNA, Vibrio
A Toolkit to Enable Hydrocarbon Conversion in Aqueous Environments
Institutions: Delft University of Technology, Delft University of Technology.
This work puts forward a toolkit that enables the conversion of alkanes by Escherichia coli
and presents a proof of principle of its applicability. The toolkit consists of multiple standard interchangeable parts (BioBricks)9
addressing the conversion of alkanes, regulation of gene expression and survival in toxic hydrocarbon-rich environments.
A three-step pathway for alkane degradation was implemented in E. coli
to enable the conversion of medium- and long-chain alkanes to their respective alkanols, alkanals and ultimately alkanoic-acids. The latter were metabolized via the native β-oxidation pathway. To facilitate the oxidation of medium-chain alkanes (C5-C13) and cycloalkanes (C5-C8), four genes (alkB2
) of the alkane hydroxylase system from Gordonia
were transformed into E. coli
. For the conversion of long-chain alkanes (C15-C36), theladA
gene from Geobacillus thermodenitrificans
was implemented. For the required further steps of the degradation process, ADH
and ALDH (
originating from G. thermodenitrificans
) were introduced10,11
. The activity was measured by resting cell assays. For each oxidative step, enzyme activity was observed.
To optimize the process efficiency, the expression was only induced under low glucose conditions: a substrate-regulated promoter, pCaiF, was used. pCaiF is present in E. coli
K12 and regulates the expression of the genes involved in the degradation of non-glucose carbon sources.
The last part of the toolkit - targeting survival - was implemented using solvent tolerance genes, PhPFDα and β, both from Pyrococcus horikoshii
OT3. Organic solvents can induce cell stress and decreased survivability by negatively affecting protein folding. As chaperones, PhPFDα and β improve the protein folding process e.g.
under the presence of alkanes. The expression of these genes led to an improved hydrocarbon tolerance shown by an increased growth rate (up to 50%) in the presences of 10% n
-hexane in the culture medium were observed.
Summarizing, the results indicate that the toolkit enables E. coli
to convert and tolerate hydrocarbons in aqueous environments. As such, it represents an initial step towards a sustainable solution for oil-remediation using a synthetic biology approach.
Bioengineering, Issue 68, Microbiology, Biochemistry, Chemistry, Chemical Engineering, Oil remediation, alkane metabolism, alkane hydroxylase system, resting cell assay, prefoldin, Escherichia coli, synthetic biology, homologous interaction mapping, mathematical model, BioBrick, iGEM
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro
. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro
and in vivo
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g.
drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2
. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4
in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Unraveling the Unseen Players in the Ocean - A Field Guide to Water Chemistry and Marine Microbiology
Institutions: San Diego State University, University of California San Diego.
Here we introduce a series of thoroughly tested and well standardized research protocols adapted for use in remote marine environments. The sampling protocols include the assessment of resources available to the microbial community (dissolved organic carbon, particulate organic matter, inorganic nutrients), and a comprehensive description of the viral and bacterial communities (via direct viral and microbial counts, enumeration of autofluorescent microbes, and construction of viral and microbial metagenomes). We use a combination of methods, which represent a dispersed field of scientific disciplines comprising already established protocols and some of the most recent techniques developed. Especially metagenomic sequencing techniques used for viral and bacterial community characterization, have been established only in recent years, and are thus still subjected to constant improvement. This has led to a variety of sampling and sample processing procedures currently in use. The set of methods presented here provides an up to date approach to collect and process environmental samples. Parameters addressed with these protocols yield the minimum on information essential to characterize and understand the underlying mechanisms of viral and microbial community dynamics. It gives easy to follow guidelines to conduct comprehensive surveys and discusses critical steps and potential caveats pertinent to each technique.
Environmental Sciences, Issue 93, dissolved organic carbon, particulate organic matter, nutrients, DAPI, SYBR, microbial metagenomics, viral metagenomics, marine environment
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e.
total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contribution of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila
, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
Using Coculture to Detect Chemically Mediated Interspecies Interactions
Institutions: University of North Carolina at Chapel Hill .
In nature, bacteria rarely exist in isolation; they are instead surrounded by a diverse array of other microorganisms that alter the local environment by secreting metabolites. These metabolites have the potential to modulate the physiology and differentiation of their microbial neighbors and are likely important factors in the establishment and maintenance of complex microbial communities. We have developed a fluorescence-based coculture screen to identify such chemically mediated microbial interactions. The screen involves combining a fluorescent transcriptional reporter strain with environmental microbes on solid media and allowing the colonies to grow in coculture. The fluorescent transcriptional reporter is designed so that the chosen bacterial strain fluoresces when it is expressing a particular phenotype of interest (i.e.
biofilm formation, sporulation, virulence factor production, etc
.) Screening is performed under growth conditions where this phenotype is not
expressed (and therefore the reporter strain is typically nonfluorescent). When an environmental microbe secretes a metabolite that activates this phenotype, it diffuses through the agar and activates the fluorescent reporter construct. This allows the inducing-metabolite-producing microbe to be detected: they are the nonfluorescent colonies most proximal to the fluorescent colonies. Thus, this screen allows the identification of environmental microbes that produce diffusible metabolites that activate a particular physiological response in a reporter strain. This publication discusses how to: a) select appropriate coculture screening conditions, b) prepare the reporter and environmental microbes for screening, c) perform the coculture screen, d) isolate putative inducing organisms, and e) confirm their activity in a secondary screen. We developed this method to screen for soil organisms that activate biofilm matrix-production in Bacillus subtilis
; however, we also discuss considerations for applying this approach to other genetically tractable bacteria.
Microbiology, Issue 80, High-Throughput Screening Assays, Genes, Reporter, Microbial Interactions, Soil Microbiology, Coculture, microbial interactions, screen, fluorescent transcriptional reporters, Bacillus subtilis
Genetic Manipulation in Δku80 Strains for Functional Genomic Analysis of Toxoplasma gondii
Institutions: The Geisel School of Medicine at Dartmouth.
Targeted genetic manipulation using homologous recombination is the method of choice for functional genomic analysis to obtain a detailed view of gene function and phenotype(s). The development of mutant strains with targeted gene deletions, targeted mutations, complemented gene function, and/or tagged genes provides powerful strategies to address gene function, particularly if these genetic manipulations can be efficiently targeted to the gene locus of interest using integration mediated by double cross over homologous recombination.
Due to very high rates of nonhomologous recombination, functional genomic analysis of Toxoplasma gondii
has been previously limited by the absence of efficient methods for targeting gene deletions and gene replacements to specific genetic loci. Recently, we abolished the major pathway of nonhomologous recombination in type I and type II strains of T. gondii
by deleting the gene encoding the KU80 protein1,2
. The Δku80
strains behave normally during tachyzoite (acute) and bradyzoite (chronic) stages in vitro
and in vivo
and exhibit essentially a 100% frequency of homologous recombination. The Δku80
strains make functional genomic studies feasible on the single gene as well as on the genome scale1-4
Here, we report methods for using type I and type II Δku80Δhxgprt
strains to advance gene targeting approaches in T. gondii
. We outline efficient methods for generating gene deletions, gene replacements, and tagged genes by targeted insertion or deletion of the hypoxanthine-xanthine-guanine phosphoribosyltransferase (HXGPRT
) selectable marker. The described gene targeting protocol can be used in a variety of ways in Δku80
strains to advance functional analysis of the parasite genome and to develop single strains that carry multiple targeted genetic manipulations. The application of this genetic method and subsequent phenotypic assays will reveal fundamental and unique aspects of the biology of T. gondii
and related significant human pathogens that cause malaria (Plasmodium
sp.) and cryptosporidiosis (Cryptosporidium
Infectious Diseases, Issue 77, Genetics, Microbiology, Infection, Medicine, Immunology, Molecular Biology, Cellular Biology, Biomedical Engineering, Bioengineering, Genomics, Parasitology, Pathology, Apicomplexa, Coccidia, Toxoplasma, Genetic Techniques, Gene Targeting, Eukaryota, Toxoplasma gondii, genetic manipulation, gene targeting, gene deletion, gene replacement, gene tagging, homologous recombination, DNA, sequencing
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g.
protein-protein) and functional (e.g.
gene-gene or genetic) interactions (GI)1
. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2
. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7
, but GI information remains sparse for prokaryotes8
, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10
Here, we present the key steps required to perform quantitative E. coli
Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9
, using natural bacterial conjugation and homologous recombination to systemically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm) - marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan) - marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g.
the 'Keio' collection11
) and essential gene hypomorphic mutations (i.e.
alleles conferring reduced protein expression, stability, or activity9, 12, 13
) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14
. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9
. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2
. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e.
slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2
as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Monitoring Intraspecies Competition in a Bacterial Cell Population by Cocultivation of Fluorescently Labelled Strains
Institutions: Georg-August University.
Many microorganisms such as bacteria proliferate extremely fast and the populations may reach high cell densities. Small fractions of cells in a population always have accumulated mutations that are either detrimental or beneficial for the cell. If the fitness effect of a mutation provides the subpopulation with a strong selective growth advantage, the individuals of this subpopulation may rapidly outcompete and even completely eliminate their immediate fellows. Thus, small genetic changes and selection-driven accumulation of cells that have acquired beneficial mutations may lead to a complete shift of the genotype of a cell population. Here we present a procedure to monitor the rapid clonal expansion and elimination of beneficial and detrimental mutations, respectively, in a bacterial cell population over time by cocultivation of fluorescently labeled individuals of the Gram-positive model bacterium Bacillus subtilis
. The method is easy to perform and very illustrative to display intraspecies competition among the individuals in a bacterial cell population.
Cellular Biology, Issue 83, Bacillus subtilis, evolution, adaptation, selective pressure, beneficial mutation, intraspecies competition, fluorophore-labelling, Fluorescence Microscopy
Metabolic Pathway Confirmation and Discovery Through 13C-labeling of Proteinogenic Amino Acids
Institutions: Washington University, Washington University, Washington University.
Microbes have complex metabolic pathways that can be investigated using biochemistry and functional genomics methods. One important technique to examine cell central metabolism and discover new enzymes is 13
C-assisted metabolism analysis 1. This technique is based on isotopic labeling, whereby microbes are fed with a 13
C labeled substrates. By tracing the atom transition paths between metabolites in the biochemical network, we can determine functional pathways and discover new enzymes.
As a complementary method to transcriptomics and proteomics, approaches for isotopomer-assisted analysis of metabolic pathways contain three major steps 2
, we grow cells with 13
C labeled substrates. In this step, the composition of the medium and the selection of labeled substrates are two key factors. To avoid measurement noises from non-labeled carbon in nutrient supplements, a minimal medium with a sole carbon source is required. Further, the choice of a labeled substrate is based on how effectively it will elucidate the pathway being analyzed. Because novel enzymes often involve different reaction stereochemistry or intermediate products, in general, singly labeled carbon substrates are more informative for detection of novel pathways than uniformly labeled ones for detection of novel pathways3, 4
, we analyze amino acid labeling patterns using GC-MS. Amino acids are abundant in protein and thus can be obtained from biomass hydrolysis. Amino acids can be derivatized by N-(tert-butyldimethylsilyl)-N-methyltrifluoroacetamide (TBDMS) before GC separation. TBDMS derivatized amino acids can be fragmented by MS and result in different arrays of fragments. Based on the mass to charge (m/z) ratio of fragmented and unfragmented amino acids, we can deduce the possible labeled patterns of the central metabolites that are precursors of the amino acids. Third
, we trace 13C carbon transitions in the proposed pathways and, based on the isotopomer data, confirm whether these pathways are active 2
. Measurement of amino acids provides isotopic labeling information about eight crucial precursor metabolites in the central metabolism. These metabolic key nodes can reflect the functions of associated central pathways.
C-assisted metabolism analysis via proteinogenic amino acids can be widely used for functional characterization of poorly-characterized microbial metabolism1
. In this protocol, we will use Cyanothece
51142 as the model strain to demonstrate the use of labeled carbon substrates for discovering new enzymatic functions.
Molecular Biology, Issue 59, GC-MS, novel pathway, metabolism, labeling, phototrophic microorganism
Massively Parallel Reporter Assays in Cultured Mammalian Cells
Institutions: Broad Institute.
The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence on interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.
Genetics, Issue 90, gene regulation, transcriptional regulation, sequence-activity mapping, reporter assay, library cloning, transfection, tag sequencing, mammalian cells
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif