In vivo methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro microarray based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs and thus predicting the function for a sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris Hildenborough.
21 Related JoVE Articles!
Enrichment of NK Cells from Human Blood with the RosetteSep Kit from StemCell Technologies
Institutions: University of California, Irvine (UCI).
Natural killer (NK) cells are large granular cytotoxic lymphocytes that belong to the innate immune system and play major roles in fighting against cancer and infections, but are also implicated in the early stages of pregnancy and transplant rejection. These cells are present in peripheral blood, from which they can be isolated. Cells can be isolated using either positive or negative selection. For positive selection we use antibodies directed to a surface marker present only on the cells of interest whereas for negative selection we use cocktails of antibodies targeted to surface markers present on all cells but the cells of interest. This latter technique presents the advantage of leaving the cells of interest free of antibodies, thereby reducing the risk of unwanted cell activation or differenciation. In this video-protocol we demonstrate how to separate NK cells from human blood by negative selection, using the RosetteSep kit from StemCell technologies. The procedure involves obtaining human peripheral blood (under an institutional review board-approved protocol to protect the human subjects) and mixing it with a cocktail of antibodies that will bind to markers absent on NK cells, but present on all other mononuclear cells present in peripheral blood (e.g., T lymphocytes, monocytes...). The antibodies present in the cocktail are conjugated to antibodies directed to glycophorin A on erythrocytes. All unwanted cells and red blood cells will therefore be trapped in complexes. The mix of blood and antibody cocktail is then diluted, overlayed on a Histopaque gradient, and centrifuged. NK cells (>80% pure) can be collected at the interface between the Histopaque and the diluted plasma. Similar cocktails are available for enrichment of other cell populations, such as human T lymphocytes.
Immunology, issue 8, blood, cell isolation, natural killer, lymphocyte, primary cells, negative selection, PBMC, Ficoll gradient, cell separation
The ChroP Approach Combines ChIP and Mass Spectrometry to Dissect Locus-specific Proteomic Landscapes of Chromatin
Institutions: European Institute of Oncology.
Chromatin is a highly dynamic nucleoprotein complex made of DNA and proteins that controls various DNA-dependent processes. Chromatin structure and function at specific regions is regulated by the local enrichment of histone post-translational modifications (hPTMs) and variants, chromatin-binding proteins, including transcription factors, and DNA methylation. The proteomic characterization of chromatin composition at distinct functional regions has been so far hampered by the lack of efficient protocols to enrich such domains at the appropriate purity and amount for the subsequent in-depth analysis by Mass Spectrometry (MS). We describe here a newly designed chromatin proteomics strategy, named ChroP (Chromatin Proteomics
), whereby a preparative chromatin immunoprecipitation is used to isolate distinct chromatin regions whose features, in terms of hPTMs, variants and co-associated non-histonic proteins, are analyzed by MS. We illustrate here the setting up of ChroP for the enrichment and analysis of transcriptionally silent heterochromatic regions, marked by the presence of tri-methylation of lysine 9 on histone H3. The results achieved demonstrate the potential of ChroP
in thoroughly characterizing the heterochromatin proteome and prove it as a powerful analytical strategy for understanding how the distinct protein determinants of chromatin interact and synergize to establish locus-specific structural and functional configurations.
Biochemistry, Issue 86, chromatin, histone post-translational modifications (hPTMs), epigenetics, mass spectrometry, proteomics, SILAC, chromatin immunoprecipitation , histone variants, chromatome, hPTMs cross-talks
Scalable High Throughput Selection From Phage-displayed Synthetic Antibody Libraries
Institutions: The Recombinant Antibody Network, University of Toronto, University of California, San Francisco at Mission Bay, The University of Chicago.
The demand for antibodies that fulfill the needs of both basic and clinical research applications is high and will dramatically increase in the future. However, it is apparent that traditional monoclonal technologies are not alone up to this task. This has led to the development of alternate methods to satisfy the demand for high quality and renewable affinity reagents to all accessible elements of the proteome. Toward this end, high throughput methods for conducting selections from phage-displayed synthetic antibody libraries have been devised for applications involving diverse antigens and optimized for rapid throughput and success. Herein, a protocol is described in detail that illustrates with video demonstration the parallel selection of Fab-phage clones from high diversity libraries against hundreds of targets using either a manual 96 channel liquid handler or automated robotics system. Using this protocol, a single user can generate hundreds of antigens, select antibodies to them in parallel and validate antibody binding within 6-8 weeks. Highlighted are: i) a viable antigen format, ii) pre-selection antigen characterization, iii) critical steps that influence the selection of specific and high affinity clones, and iv) ways of monitoring selection effectiveness and early stage antibody clone characterization. With this approach, we have obtained synthetic antibody fragments (Fabs) to many target classes including single-pass membrane receptors, secreted protein hormones, and multi-domain intracellular proteins. These fragments are readily converted to full-length antibodies and have been validated to exhibit high affinity and specificity. Further, they have been demonstrated to be functional in a variety of standard immunoassays including Western blotting, ELISA, cellular immunofluorescence, immunoprecipitation and related assays. This methodology will accelerate antibody discovery and ultimately bring us closer to realizing the goal of generating renewable, high quality antibodies to the proteome.
Immunology, Issue 95, Bacteria, Viruses, Amino Acids, Peptides, and Proteins, Nucleic Acids, Nucleotides, and Nucleosides, Life Sciences (General), phage display, synthetic antibodies, high throughput, antibody selection, scalable methodology
Linear Amplification Mediated PCR – Localization of Genetic Elements and Characterization of Unknown Flanking DNA
Institutions: National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ).
Linear-amplification mediated PCR (LAM-PCR) has been developed to study hematopoiesis in gene corrected cells of patients treated by gene therapy with integrating vector systems. Due to the stable integration of retroviral vectors, integration sites can be used to study the clonal fate of individual cells and their progeny. LAM- PCR for the first time provided evidence that leukemia in gene therapy treated patients originated from provirus induced overexpression of a neighboring proto-oncogene. The high sensitivity and specificity of LAM-PCR compared to existing methods like inverse PCR and ligation mediated (LM)-PCR is achieved by an initial preamplification step (linear PCR of 100 cycles) using biotinylated vector specific primers which allow subsequent reaction steps to be carried out on solid phase (magnetic beads). LAM-PCR is currently the most sensitive method available to identify unknown DNA which is located in the proximity of known DNA. Recently, a variant of LAM-PCR has been developed that circumvents restriction digest thus abrogating retrieval bias of integration sites and enables a comprehensive analysis of provirus locations in host genomes. The following protocol explains step-by-step the amplification of both 3’- and 5’- sequences adjacent to the integrated lentiviral vector.
Genetics, Issue 88, gene therapy, integrome, integration site analysis, LAM-PCR, retroviral vectors, lentiviral vectors, AAV, deep sequencing, clonal inventory, mutagenesis screen
Generation of Genomic Deletions in Mammalian Cell Lines via CRISPR/Cas9
Institutions: Harvard Medical School, Boston Children's Hospital, Dana-Farber Cancer Institute, Howard Hughes Medical Institute.
The prokaryotic clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system may be re-purposed for site-specific eukaryotic genome engineering. CRISPR/Cas9 is an inexpensive, facile, and efficient genome editing tool that allows genetic perturbation of genes and genetic elements. Here we present a simple methodology for CRISPR design, cloning, and delivery for the production of genomic deletions. In addition, we describe techniques for deletion, identification, and characterization. This strategy relies on cellular delivery of a pair of chimeric single guide RNAs (sgRNAs) to create two double strand breaks (DSBs) at a locus in order to delete the intervening DNA segment by non-homologous end joining (NHEJ) repair. Deletions have potential advantages as compared to single-site small indels given the efficiency of biallelic modification, ease of rapid identification by PCR, predictability of loss-of-function, and utility for the study of non-coding elements. This approach can be used for efficient loss-of-function studies of genes and genetic elements in mammalian cell lines.
Molecular Biology, Issue 95, CRISPR, Cas9, Genome Engineering, Gene Knockout, Genomic Deletion, Gene Regulation
Automating ChIP-seq Experiments to Generate Epigenetic Profiles on 10,000 HeLa Cells
Institutions: Diagenode S.A., Diagenode Inc..
Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) is a technique of choice for studying protein-DNA interactions. ChIP-seq has been used for mapping protein-DNA interactions and allocating histones modifications. The procedure is tedious and time consuming, and one of the major limitations is the requirement for high amounts of starting material, usually millions of cells. Automation of chromatin immunoprecipitation assays is possible when the procedure is based on the use of magnetic beads. Successful automated protocols of chromatin immunoprecipitation and library preparation have been specifically designed on a commercially available robotic liquid handling system dedicated mainly to automate epigenetic assays. First, validation of automated ChIP-seq assays using antibodies directed against various histone modifications was shown, followed by optimization of the automated protocols to perform chromatin immunoprecipitation and library preparation starting with low cell numbers. The goal of these experiments is to provide a valuable tool for future epigenetic analysis of specific cell types, sub-populations, and biopsy samples.
Molecular Biology, Issue 94, Automation, chromatin immunoprecipitation, low DNA amounts, histone antibodies, sequencing, library preparation
Subcloning Plus Insertion (SPI) - A Novel Recombineering Method for the Rapid Construction of Gene Targeting Vectors
Institutions: University of Leicester, Center for Fisheries, Environment and Aquaculture Sciences, University of Leicester.
Gene targeting refers to the precise modification of a genetic locus using homologous recombination. The generation of novel cell lines and transgenic mouse models using this method necessitates the construction of a ‘targeting’ vector, which contains homologous DNA sequences to the target gene, and has for many years been a limiting step in the process. Vector construction can be performed in vivo
in Escherichia coli
cells using homologous recombination mediated by phage recombinases using a technique termed recombineering. Recombineering is the preferred technique to subclone the long homology sequences (>4kb) and various targeting elements including selection markers that are required to mediate efficient allelic exchange between a targeting vector and its cognate genomic locus. Typical recombineering protocols follow an iterative scheme of step-wise integration of the targeting elements and require intermediate purification and transformation steps. Here, we present a novel recombineering methodology of vector assembly using a multiplex approach. Plasmid gap repair is performed by the simultaneous capture of genomic sequence from mouse Bacterial Artificial Chromosome libraries and the insertion of dual bacterial and mammalian selection markers. This subcloning plus insertion method is highly efficient and yields a majority of correct recombinants. We present data for the construction of different types of conditional gene knockout, or knock-in, vectors and BAC reporter vectors that have been constructed using this method. SPI vector construction greatly extends the repertoire of the recombineering toolbox and provides a simple, rapid and cost-effective method of constructing these highly complex vectors.
Molecular Biology, Issue 95, recombineering, gap-repair, subcloning plus insertion, transgene, knockout, mouse
Enhanced Reduced Representation Bisulfite Sequencing for Assessment of DNA Methylation at Base Pair Resolution
Institutions: Weill Cornell Medical College, Weill Cornell Medical College, Weill Cornell Medical College, University of Michigan.
DNA methylation pattern mapping is heavily studied in normal and diseased tissues. A variety of methods have been established to interrogate the cytosine methylation patterns in cells. Reduced representation of whole genome bisulfite sequencing was developed to detect quantitative base pair resolution cytosine methylation patterns at GC-rich genomic loci. This is accomplished by combining the use of a restriction enzyme followed by bisulfite conversion. Enhanced Reduced Representation Bisulfite Sequencing (ERRBS) increases the biologically relevant genomic loci covered and has been used to profile cytosine methylation in DNA from human, mouse and other organisms. ERRBS initiates with restriction enzyme digestion of DNA to generate low molecular weight fragments for use in library preparation. These fragments are subjected to standard library construction for next generation sequencing. Bisulfite conversion of unmethylated cytosines prior to the final amplification step allows for quantitative base resolution of cytosine methylation levels in covered genomic loci. The protocol can be completed within four days. Despite low complexity in the first three bases sequenced, ERRBS libraries yield high quality data when using a designated sequencing control lane. Mapping and bioinformatics analysis is then performed and yields data that can be easily integrated with a variety of genome-wide platforms. ERRBS can utilize small input material quantities making it feasible to process human clinical samples and applicable in a range of research applications. The video produced demonstrates critical steps of the ERRBS protocol.
Genetics, Issue 96, Epigenetics, bisulfite sequencing, DNA methylation, genomic DNA, 5-methylcytosine, high-throughput
Genome-wide Snapshot of Chromatin Regulators and States in Xenopus Embryos by ChIP-Seq
Institutions: MRC National Institute for Medical Research.
The recruitment of chromatin regulators and the assignment of chromatin states to specific genomic loci are pivotal to cell fate decisions and tissue and organ formation during development. Determining the locations and levels of such chromatin features in vivo
will provide valuable information about the spatio-temporal regulation of genomic elements, and will support aspirations to mimic embryonic tissue development in vitro
. The most commonly used method for genome-wide and high-resolution profiling is chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq). This protocol outlines how yolk-rich embryos such as those of the frog Xenopus
can be processed for ChIP-Seq experiments, and it offers simple command lines for post-sequencing analysis. Because of the high efficiency with which the protocol extracts nuclei from formaldehyde-fixed tissue, the method allows easy upscaling to obtain enough ChIP material for genome-wide profiling. Our protocol has been used successfully to map various DNA-binding proteins such as transcription factors, signaling mediators, components of the transcription machinery, chromatin modifiers and post-translational histone modifications, and for this to be done at various stages of embryogenesis. Lastly, this protocol should be widely applicable to other model and non-model organisms as more and more genome assemblies become available.
Developmental Biology, Issue 96, Chromatin immunoprecipitation, next-generation sequencing, ChIP-Seq, developmental biology, Xenopus embryos, cross-linking, transcription factor, post-sequencing analysis, DNA occupancy, metagene, binding motif, GO term
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University, BioMérieux, Hospices Civils de Lyon, Hospices Civils de Lyon.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values1
. ‘How to use PSA' remains a current issue, either for diagnosis as a gray zone corresponding to a concentration in serum of 2.5-10 ng/ml which does not allow a clear differentiation to be made between cancer and noncancer2
or for patient follow-up as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application3,4
. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g.
PCA3 in prostate cancer5,6
and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5x higher than the one corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer7-10
. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Genetic Manipulation in Δku80 Strains for Functional Genomic Analysis of Toxoplasma gondii
Institutions: The Geisel School of Medicine at Dartmouth.
Targeted genetic manipulation using homologous recombination is the method of choice for functional genomic analysis to obtain a detailed view of gene function and phenotype(s). The development of mutant strains with targeted gene deletions, targeted mutations, complemented gene function, and/or tagged genes provides powerful strategies to address gene function, particularly if these genetic manipulations can be efficiently targeted to the gene locus of interest using integration mediated by double cross over homologous recombination.
Due to very high rates of nonhomologous recombination, functional genomic analysis of Toxoplasma gondii
has been previously limited by the absence of efficient methods for targeting gene deletions and gene replacements to specific genetic loci. Recently, we abolished the major pathway of nonhomologous recombination in type I and type II strains of T. gondii
by deleting the gene encoding the KU80 protein1,2
. The Δku80
strains behave normally during tachyzoite (acute) and bradyzoite (chronic) stages in vitro
and in vivo
and exhibit essentially a 100% frequency of homologous recombination. The Δku80
strains make functional genomic studies feasible on the single gene as well as on the genome scale1-4
Here, we report methods for using type I and type II Δku80Δhxgprt
strains to advance gene targeting approaches in T. gondii
. We outline efficient methods for generating gene deletions, gene replacements, and tagged genes by targeted insertion or deletion of the hypoxanthine-xanthine-guanine phosphoribosyltransferase (HXGPRT
) selectable marker. The described gene targeting protocol can be used in a variety of ways in Δku80
strains to advance functional analysis of the parasite genome and to develop single strains that carry multiple targeted genetic manipulations. The application of this genetic method and subsequent phenotypic assays will reveal fundamental and unique aspects of the biology of T. gondii
and related significant human pathogens that cause malaria (Plasmodium
sp.) and cryptosporidiosis (Cryptosporidium
Infectious Diseases, Issue 77, Genetics, Microbiology, Infection, Medicine, Immunology, Molecular Biology, Cellular Biology, Biomedical Engineering, Bioengineering, Genomics, Parasitology, Pathology, Apicomplexa, Coccidia, Toxoplasma, Genetic Techniques, Gene Targeting, Eukaryota, Toxoplasma gondii, genetic manipulation, gene targeting, gene deletion, gene replacement, gene tagging, homologous recombination, DNA, sequencing
Genome-wide Analysis using ChIP to Identify Isoform-specific Gene Targets
Institutions: University of Illinois Chicago - UIC, Universitat Pompeu Fabra, Whitehead Institute for Biomedical Research.
Recruitment of transcriptional and epigenetic factors to their targets is a key step in their regulation. Prominently featured in recruitment are the protein domains that bind to specific histone modifications. One such domain is the plant homeodomain (PHD), found in several chromatin-binding proteins. The epigenetic factor RBP2 has multiple PHD domains, however, they have different functions (Figure 4). In particular, the C-terminal PHD domain, found in a RBP2 oncogenic fusion in human leukemia, binds to trimethylated lysine 4 in histone H3 (H3K4me3)1
. The transcript corresponding to the RBP2 isoform containing the C-terminal PHD accumulates during differentiation of promonocytic, lymphoma-derived, U937 cells into monocytes2
. Consistent with both sets of data, genome-wide analysis showed that in differentiated U937 cells, the RBP2 protein gets localized to genomic regions highly enriched for H3K4me33
. Localization of RBP2 to its targets correlates with a decrease in H3K4me3 due to RBP2 histone demethylase activity and a decrease in transcriptional activity. In contrast, two other PHDs of RBP2 are unable to bind H3K4me3. Notably, the C-terminal domain PHD of RBP2 is absent in the smaller RBP2 isoform4
. It is conceivable that the small isoform of RBP2, which lacks interaction with H3K4me3, differs from the larger isoform in genomic location. The difference in genomic location of RBP2 isoforms may account for the observed diversity in RBP2 function. Specifically, RBP2 is a critical player in cellular differentiation mediated by the retinoblastoma protein (pRB). Consistent with these data, previous genome-wide analysis, without distinction between isoforms, identified two distinct groups of RBP2 target genes: 1) genes bound by RBP2 in a manner that is independent of differentiation; 2) genes bound by RBP2 in a differentiation-dependent manner.
To identify differences in localization between the isoforms we performed genome-wide location analysis by ChIP-Seq. Using antibodies that detect both RBP2 isoforms we have located all RBP2 targets. Additionally we have antibodies that only bind large, and not small RBP2 isoform (Figure 4). After identifying the large isoform targets, one can then subtract them from all RBP2 targets to reveal the targets of small isoform. These data show the contribution of chromatin-interacting domain in protein recruitment to its binding sites in the genome.
Biochemistry, Issue 41, chromatin immunoprecipitation, ChIP-Seq, RBP2, JARID1A, KDM5A, isoform-specific recruitment
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal, Universite de Montreal, Universite de Montreal.
There are several lines of evidence supporting the role of de novo
mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo
mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo
mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness1
and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo
mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them2
. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, the effects of paternal age: there is a significantly increased risk of the disease with increasing paternal age, which could result from the age related increase in paternal de novo
mutations. This is the case for autism and schizophrenia3
. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo
mutations would more frequently come from males, particularly older males4
. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complexes diseases genes, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification for de novo
mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo
mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI
program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al.
2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Isolation of Fidelity Variants of RNA Viruses and Characterization of Virus Mutation Frequency
Institutions: Institut Pasteur .
RNA viruses use RNA dependent RNA polymerases to replicate their genomes. The intrinsically high error rate of these enzymes is a large contributor to the generation of extreme population diversity that facilitates virus adaptation and evolution. Increasing evidence shows that the intrinsic error rates, and the resulting mutation frequencies, of RNA viruses can be modulated by subtle amino acid changes to the viral polymerase. Although biochemical assays exist for some viral RNA polymerases that permit quantitative measure of incorporation fidelity, here we describe a simple method of measuring mutation frequencies of RNA viruses that has proven to be as accurate as biochemical approaches in identifying fidelity altering mutations. The approach uses conventional virological and sequencing techniques that can be performed in most biology laboratories. Based on our experience with a number of different viruses, we have identified the key steps that must be optimized to increase the likelihood of isolating fidelity variants and generating data of statistical significance. The isolation and characterization of fidelity altering mutations can provide new insights into polymerase structure and function1-3
. Furthermore, these fidelity variants can be useful tools in characterizing mechanisms of virus adaptation and evolution4-7
Immunology, Issue 52, Polymerase fidelity, RNA virus, mutation frequency, mutagen, RNA polymerase, viral evolution
Chromatin Isolation by RNA Purification (ChIRP)
Institutions: Stanford University School of Medicine.
Long noncoding RNAs are key regulators of chromatin states for important biological processes such as dosage compensation, imprinting, and developmental gene expression 1,2,3,4,5,6,7
. The recent discovery of thousands of lncRNAs in association with specific chromatin modification complexes, such as Polycomb Repressive Complex 2 (PRC2) that mediates histone H3 lysine 27 trimethylation (H3K27me3), suggests broad roles for numerous lncRNAs in managing chromatin states in a gene-specific fashion 8,9
. While some lncRNAs are thought to work in cis on neighboring genes, other lncRNAs work in trans to regulate distantly located genes. For instance, Drosophila
lncRNAs roX1 and roX2 bind numerous regions on the X chromosome of male cells, and are critical for dosage compensation 10,11
. However, the exact locations of their binding sites are not known at high resolution. Similarly, human lncRNA HOTAIR can affect PRC2 occupancy on hundreds of genes genome-wide 3,12,13
, but how specificity is achieved is unclear. LncRNAs can also serve as modular scaffolds to recruit the assembly of multiple protein complexes. The classic trans-acting RNA scaffold is the TERC RNA that serves as the template and scaffold for the telomerase complex 14
; HOTAIR can also serve as a scaffold for PRC2 and a H3K4 demethylase complex 13
Prior studies mapping RNA occupancy at chromatin have revealed substantial insights 15,16
, but only at a single gene locus at a time. The occupancy sites of most lncRNAs are not known, and the roles of lncRNAs in chromatin regulation have been mostly inferred from the indirect effects of lncRNA perturbation. Just as chromatin immunoprecipitation followed by microarray or deep sequencing (ChIP-chip or ChIP-seq, respectively) has greatly improved our understanding of protein-DNA interactions on a genomic scale, here we illustrate a recently published strategy to map long RNA occupancy genome-wide at high resolution 17
. This method, Chromatin Isolation by RNA Purification (ChIRP) (Figure 1
), is based on affinity capture of target lncRNA:chromatin complex by tiling antisense-oligos, which then generates a map of genomic binding sites at a resolution of several hundred bases with high sensitivity and low background. ChIRP is applicable to many lncRNAs because the design of affinity-probes is straightforward given the RNA sequence and requires no knowledge of the RNA's structure or functional domains.
Genetics, Issue 61, long noncoding RNA (lncRNA), genomics, chromatin binding, high-throughput sequencing, ChIRP
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g.
protein-protein) and functional (e.g.
gene-gene or genetic) interactions (GI)1
. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2
. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7
, but GI information remains sparse for prokaryotes8
, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10
Here, we present the key steps required to perform quantitative E. coli
Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9
, using natural bacterial conjugation and homologous recombination to systemically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm) - marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan) - marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g.
the 'Keio' collection11
) and essential gene hypomorphic mutations (i.e.
alleles conferring reduced protein expression, stability, or activity9, 12, 13
) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14
. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9
. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2
. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e.
slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2
as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1
. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2
. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5
to more rigorous statistical models, e.g.
Hidden Markov Models (HMMs)6-8
. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9
, which utilizes only explicit formulas-an innovation crucial to its performance advantages. More sophisticated then heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity-evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11
and epigenetic data12
to illustrate its usefulness.
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
Generation of High Quality Chromatin Immunoprecipitation DNA Template for High-throughput Sequencing (ChIP-seq)
Institutions: Children's Hospital of Philadelphia Research Institute, University of Pennsylvania .
ChIP-sequencing (ChIP-seq) methods directly offer whole-genome coverage, where combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing can be utilized to identify the repertoire of mammalian DNA sequences bound by transcription factors in vivo
. "Next-generation" genome sequencing technologies provide 1-2 orders of magnitude increase in the amount of sequence that can be cost-effectively generated over older technologies thus allowing for ChIP-seq methods to directly provide whole-genome coverage for effective profiling of mammalian protein-DNA interactions.
For successful ChIP-seq approaches, one must generate high quality ChIP DNA template to obtain the best sequencing outcomes. The description is based around experience with the protein product of the gene most strongly implicated in the pathogenesis of type 2 diabetes, namely the transcription factor transcription factor 7-like 2 (TCF7L2). This factor has also been implicated in various cancers.
Outlined is how to generate high quality ChIP DNA template derived from the colorectal carcinoma cell line, HCT116, in order to build a high-resolution map through sequencing to determine the genes bound by TCF7L2, giving further insight in to its key role in the pathogenesis of complex traits.
Molecular Biology, Issue 74, Genetics, Biochemistry, Microbiology, Medicine, Proteins, DNA-Binding Proteins, Transcription Factors, Chromatin Immunoprecipitation, Genes, chromatin, immunoprecipitation, ChIP, DNA, PCR, sequencing, antibody, cross-link, cell culture, assay
A Method for Selecting Structure-switching Aptamers Applied to a Colorimetric Gold Nanoparticle Assay
Institutions: Wright-Patterson Air Force Base, The Henry M. Jackson Foundation, UES, Inc..
Small molecules provide rich targets for biosensing applications due to their physiological implications as biomarkers of various aspects of human health and performance. Nucleic acid aptamers have been increasingly applied as recognition elements on biosensor platforms, but selecting aptamers toward small molecule targets requires special design considerations. This work describes modification and critical steps of a method designed to select structure-switching aptamers to small molecule targets. Binding sequences from a DNA library hybridized to complementary DNA capture probes on magnetic beads are separated from nonbinders via a target-induced change in conformation. This method is advantageous because sequences binding the support matrix (beads) will not be further amplified, and it does not require immobilization of the target molecule. However, the melting temperature of the capture probe and library is kept at or slightly above RT, such that sequences that dehybridize based on thermodynamics will also be present in the supernatant solution. This effectively limits the partitioning efficiency (ability to separate target binding sequences from nonbinders), and therefore many selection rounds will be required to remove background sequences. The reported method differs from previous structure-switching aptamer selections due to implementation of negative selection steps, simplified enrichment monitoring, and extension of the length of the capture probe following selection enrichment to provide enhanced stringency. The selected structure-switching aptamers are advantageous in a gold nanoparticle assay platform that reports the presence of a target molecule by the conformational change of the aptamer. The gold nanoparticle assay was applied because it provides a simple, rapid colorimetric readout that is beneficial in a clinical or deployed environment. Design and optimization considerations are presented for the assay as proof-of-principle work in buffer to provide a foundation for further extension of the work toward small molecule biosensing in physiological fluids.
Molecular Biology, Issue 96, Aptamer, structure-switching, SELEX, small molecule, cortisol, next generation sequencing, gold nanoparticle, assay