Recruitment of transcriptional and epigenetic factors to their targets is a key step in their regulation. Prominently featured in recruitment are the protein domains that bind to specific histone modifications. One such domain is the plant homeodomain (PHD), found in several chromatin-binding proteins. The epigenetic factor RBP2 has multiple PHD domains; however, they have different functions (Figure 4). In particular, the C-terminal PHD domain, found in an RBP2 oncogenic fusion in human leukemia, binds to trimethylated lysine 4 in histone H3 (H3K4me3)1. The transcript corresponding to the RBP2 isoform containing the C-terminal PHD accumulates during differentiation of promonocytic, lymphoma-derived U937 cells into monocytes2. Consistent with both sets of data, genome-wide analysis showed that in differentiated U937 cells, the RBP2 protein localizes to genomic regions highly enriched for H3K4me33. Localization of RBP2 to its targets correlates with a decrease in H3K4me3 due to RBP2 histone demethylase activity and a decrease in transcriptional activity. In contrast, two other PHDs of RBP2 are unable to bind H3K4me3. Notably, the C-terminal PHD domain of RBP2 is absent in the smaller RBP2 isoform4. It is conceivable that the small isoform of RBP2, which lacks interaction with H3K4me3, differs from the larger isoform in genomic location. The difference in genomic location of RBP2 isoforms may account for the observed diversity in RBP2 function. Specifically, RBP2 is a critical player in cellular differentiation mediated by the retinoblastoma protein (pRB). Consistent with these data, a previous genome-wide analysis that did not distinguish between isoforms identified two distinct groups of RBP2 target genes: 1) genes bound by RBP2 in a manner that is independent of differentiation; 2) genes bound by RBP2 in a differentiation-dependent manner.
To identify differences in localization between the isoforms, we performed genome-wide location analysis by ChIP-Seq. Using antibodies that detect both RBP2 isoforms, we have located all RBP2 targets. Additionally, we have antibodies that bind only the large, and not the small, RBP2 isoform (Figure 4). After identifying the large isoform targets, one can then subtract them from all RBP2 targets to reveal the targets of the small isoform. These data show the contribution of a chromatin-interacting domain to the recruitment of a protein to its binding sites in the genome.
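The subtraction step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming that peaks are represented as (chromosome, start, end) tuples with half-open coordinates and that any overlap with a large-isoform peak is enough to remove a pan-RBP2 peak. The function names and the peak coordinates are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def subtract_peaks(all_peaks, large_peaks):
    """Return the peaks in all_peaks that overlap no peak in large_peaks."""
    # Group the large-isoform peaks by chromosome for a simple lookup.
    large_by_chrom = defaultdict(list)
    for chrom, start, end in large_peaks:
        large_by_chrom[chrom].append((start, end))

    remaining = []
    for chrom, start, end in all_peaks:
        hit = any(overlaps(start, end, s, e) for s, e in large_by_chrom[chrom])
        if not hit:
            remaining.append((chrom, start, end))
    return remaining


# Hypothetical peak coordinates, for illustration only.
pan_rbp2_peaks = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 250)]
large_isoform_peaks = [("chr1", 150, 350), ("chr2", 400, 600)]

small_isoform_candidates = subtract_peaks(pan_rbp2_peaks, large_isoform_peaks)
print(small_isoform_candidates)
```

Under this overlap rule, a site bound by both isoforms is removed together with the large-isoform targets, so the remaining peaks are best read as candidate sites bound only by the small isoform.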
26 Related JoVE Articles!
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-generation sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5 to more rigorous statistical models, e.g. Hidden Markov Models (HMMs)6-8. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9, which utilizes only explicit formulas, an innovation crucial to its performance advantages. More sophisticated than heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity, evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11 and epigenetic data12 to illustrate its usefulness.
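To make the idea of a change point in read density concrete, the sketch below finds the single split of a binned read-count profile that minimizes the summed squared deviation from each segment's mean. It is not the BCP algorithm, which uses a Bayesian infinite-state HMM; it is a deliberately simpler least-squares stand-in, and the function name and read counts are hypothetical.

```python
def best_change_point(counts):
    """Return the index k (1 <= k < n) that splits counts into counts[:k]
    and counts[k:] with the smallest total squared deviation from each
    segment's own mean (a least-squares single change-point search)."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two bins")

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        cost = sse(counts[:k]) + sse(counts[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical binned read counts: background followed by an enriched segment.
binned_reads = [2, 3, 2, 3, 20, 22, 21, 19]
change_point = best_change_point(binned_reads)
print(change_point)  # index of the first bin of the second segment
```

A real profile contains many change points and noisy counts, which is where a probabilistic model such as the one described above becomes necessary.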
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
Generation of High Quality Chromatin Immunoprecipitation DNA Template for High-throughput Sequencing (ChIP-seq)
Institutions: Children's Hospital of Philadelphia Research Institute, University of Pennsylvania.
ChIP-sequencing (ChIP-seq) methods directly offer whole-genome coverage, where combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing can be utilized to identify the repertoire of mammalian DNA sequences bound by transcription factors in vivo. "Next-generation" genome sequencing technologies provide a 1-2 orders of magnitude increase in the amount of sequence that can be cost-effectively generated over older technologies, thus allowing ChIP-seq methods to directly provide whole-genome coverage for effective profiling of mammalian protein-DNA interactions.
For successful ChIP-seq approaches, one must generate high quality ChIP DNA template to obtain the best sequencing outcomes. The description is based around experience with the protein product of the gene most strongly implicated in the pathogenesis of type 2 diabetes, namely the transcription factor transcription factor 7-like 2 (TCF7L2). This factor has also been implicated in various cancers.
Outlined is how to generate high quality ChIP DNA template derived from the colorectal carcinoma cell line, HCT116, in order to build a high-resolution map through sequencing to determine the genes bound by TCF7L2, giving further insight into its key role in the pathogenesis of complex traits.
Molecular Biology, Issue 74, Genetics, Biochemistry, Microbiology, Medicine, Proteins, DNA-Binding Proteins, Transcription Factors, Chromatin Immunoprecipitation, Genes, chromatin, immunoprecipitation, ChIP, DNA, PCR, sequencing, antibody, cross-link, cell culture, assay
A Chromatin Assay for Human Brain Tissue
Institutions: University of Massachusetts Medical School.
Chronic neuropsychiatric illnesses such as schizophrenia, bipolar disease and autism are thought to result from a combination of genetic and environmental factors that might result in epigenetic alterations of gene expression and other molecular pathology. Traditionally, however, expression studies in postmortem brain were confined to quantification of mRNA or protein. The limitations encountered in postmortem brain research, such as variability in autolysis time and tissue integrity, are also likely to impact any studies of higher order chromatin structures. However, the nucleosomal organization of genomic DNA, including DNA:core histone binding, appears to be largely preserved in representative samples provided by various brain banks. Therefore, it is possible to study the methylation pattern and other covalent modifications of the core histones at defined genomic loci in postmortem brain. Here, we present a simplified native chromatin immunoprecipitation (NChIP) protocol for frozen (never-fixed) human brain specimens. Starting with micrococcal nuclease digestion of brain homogenates, NChIP followed by qPCR can be completed within three days. The methodology presented here should be useful to elucidate epigenetic mechanisms of gene expression in normal and diseased human brain.
Neuroscience, Issue 13, Postmortem brain, Nucleosome, Histone, Methylation, Epigenetic, Chromatin, Human Brain
Genome-wide Gene Deletions in Streptococcus sanguinis by High Throughput PCR
Institutions: Virginia Commonwealth University.
Transposon mutagenesis and single-gene deletion are two methods applied in genome-wide gene knockout in bacteria1,2. Although transposon mutagenesis is less time consuming, less costly, and does not require completed genome information, there are two weaknesses in this method: (1) the possibility of disparate mutants in a mixed mutant library that counter-selects mutants with decreased competitiveness; and (2) the possibility of partial gene inactivation whereby genes do not entirely lose their function following the insertion of a transposon. Single-gene deletion analysis may compensate for the drawbacks associated with transposon mutagenesis. To improve the efficiency of genome-wide single gene deletion, we attempt to establish a high-throughput technique for genome-wide single gene deletion using Streptococcus sanguinis as a model organism. Each gene deletion construct in the S. sanguinis genome is designed to comprise 1-kb upstream of the targeted gene, the aphA-3 gene, encoding kanamycin resistance protein, and 1-kb downstream of the targeted gene. Three sets of primers, F1/R1, F2/R2, and F3/R3, are designed and synthesized in a 96-well plate format for PCR amplification of those three components of each deletion construct, respectively. Primers R1 and F3 contain 25-bp sequences that are complementary to regions of the aphA-3 gene at their 5' end. A large scale PCR amplification of the aphA-3 gene is performed once for creating all single-gene deletion constructs. The promoter of the aphA-3 gene is initially excluded to minimize the potential polar effect of the kanamycin cassette. To create the gene deletion constructs, high-throughput PCR amplification and purification are performed in a 96-well plate format. A linear recombinant PCR amplicon for each gene deletion is assembled through four PCR reactions using high-fidelity DNA polymerase. The initial exponential growth phase of S. sanguinis cultured in Todd Hewitt broth supplemented with 2.5% inactivated horse serum is used to increase competence for the transformation of PCR-recombinant constructs. Under this condition, up to 20% of S. sanguinis cells can be transformed using ~50 ng of DNA. Based on this approach, 2,048 mutants with single-gene deletion were ultimately obtained from the 2,270 genes in S. sanguinis, excluding four gene ORFs contained entirely within other ORFs in S. sanguinis SK36 and 218 potential essential genes. The technique for creating gene deletion constructs is high throughput and could be easy to use in genome-wide single gene deletions for any transformable bacteria.
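The layout of a deletion construct (upstream flank, resistance cassette, downstream flank) and the mutant count reported above can be sketched as follows. This is an illustrative sketch only: it assumes a linear, forward-strand sequence with 0-based half-open gene coordinates, and the toy genome, the placeholder cassette string and the function name are hypothetical. The protocol itself builds the construct by PCR, not by string concatenation.

```python
def deletion_construct(genome, gene_start, gene_end, cassette, flank=1000):
    """Return upstream flank + cassette + downstream flank for one gene.

    genome is a plain sequence string; gene_start and gene_end are 0-based,
    half-open coordinates of the gene to be replaced by the cassette.
    """
    if gene_start - flank < 0 or gene_end + flank > len(genome):
        raise ValueError("flanking region runs off the end of the sequence")
    upstream = genome[gene_start - flank:gene_start]
    downstream = genome[gene_end:gene_end + flank]
    return upstream + cassette + downstream


# Toy sequence with a 5-base flank instead of the 1-kb flank used in the protocol.
toy_genome = "AAAAACCCCCGGGGGTTTTT"
construct = deletion_construct(toy_genome, 5, 15, cassette="KAN", flank=5)
print(construct)  # AAAAAKANTTTTT

# Consistency check of the reported numbers.
total_genes = 2270
nested_orfs = 4
potential_essential = 218
mutants_obtained = total_genes - nested_orfs - potential_essential
print(mutants_obtained)  # 2048
```

The arithmetic confirms that the 2,048 mutants correspond to all genes that remain after the four nested ORFs and the 218 potential essential genes are set aside.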
Genetics, Issue 69, Microbiology, Molecular Biology, Biomedical Engineering, Genomics, Streptococcus sanguinis, Streptococcus, Genome-wide gene deletions, genes, High-throughput, PCR
Annotation of Plant Gene Function via Combined Genomics, Metabolomics and Informatics
Given the ever expanding number of model plant species for which complete genome sequences are available and the abundance of bio-resources such as knockout mutants, wild accessions and advanced breeding populations, there is a rising burden for gene functional annotation. In this protocol, annotation of plant gene function using combined co-expression gene analysis, metabolomics and informatics is provided (Figure 1). This approach is based on the theory of using target genes of known function to allow the identification of non-annotated genes likely to be involved in a certain metabolic process, with the identification of target compounds via metabolomics. Strategies are put forward for applying this information to populations generated by both forward and reverse genetics approaches, although none of these is effortless. As a corollary, this approach can also be used to characterise unknown peaks representing new or specific secondary metabolites in limited tissues, plant species or stress treatments, which is currently an important challenge in understanding plant metabolism.
Plant Biology, Issue 64, Genetics, Bioinformatics, Metabolomics, Plant metabolism, Transcriptome analysis, Functional annotation, Computational biology, Plant biology, Theoretical biology, Spectroscopy and structural analysis
Mapping Bacterial Functional Networks and Pathways in Escherichia coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g. protein-protein) and functional (e.g. gene-gene or genetic) interactions (GI)1. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7, but GI information remains sparse for prokaryotes8, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10.
Here, we present the key steps required to perform a quantitative E. coli Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9, using natural bacterial conjugation and homologous recombination to systematically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2, as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
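The comparison between observed and expected double-mutant growth can be illustrated with a small scoring sketch. The abstract does not state how the expected growth is computed, so the multiplicative model used here (expected fitness equals the product of the two single-mutant fitness values) is an assumption borrowed from common practice in genetic interaction studies. The threshold, function names and fitness values are likewise hypothetical.

```python
def gi_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the expected value.

    Assumption: expected fitness follows a multiplicative model,
    i.e. fitness_a * fitness_b. Fitness values are relative to wild type.
    """
    return fitness_ab - fitness_a * fitness_b


def classify(score, threshold=0.1):
    """Label a score using a hypothetical, symmetric cutoff."""
    if score <= -threshold:
        return "aggravating"
    if score >= threshold:
        return "alleviating"
    return "neutral"


# Hypothetical relative fitness values: (single A, single B, double AB).
examples = {
    "compensatory pathways": (0.9, 0.8, 0.2),
    "same complex": (0.6, 0.6, 0.6),
    "unrelated genes": (0.9, 0.8, 0.72),
}

for name, (fa, fb, fab) in examples.items():
    print(name, classify(gi_score(fa, fb, fab)))
```

In a real screen the cutoff would come from a statistical test on replicate colony sizes rather than from a fixed threshold.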
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
A Restriction Enzyme Based Cloning Method to Assess the In vitro Replication Capacity of HIV-1 Subtype C Gag-MJ4 Chimeric Viruses
Institutions: Emory University, Emory University.
The protective effect of many HLA class I alleles on HIV-1 pathogenesis and disease progression is, in part, attributed to their ability to target conserved portions of the HIV-1 genome that escape with difficulty. Sequence changes attributed to cellular immune pressure arise across the genome during infection, and if found within conserved regions of the genome such as Gag, can affect the ability of the virus to replicate in vitro. Transmission of HLA-linked polymorphisms in Gag to HLA-mismatched recipients has been associated with reduced set point viral loads. We hypothesized this may be due to a reduced replication capacity of the virus. Here we present a novel method for assessing the in vitro replication of HIV-1 as influenced by the gag gene isolated from acute time points from subtype C infected Zambians. This method uses restriction enzyme based cloning to insert the gag gene into a common subtype C HIV-1 proviral backbone, MJ4. This makes it more appropriate to the study of subtype C sequences than previous recombination based methods that have assessed the in vitro replication of chronically derived gag-pro sequences. Nevertheless, the protocol could be readily modified for studies of viruses from other subtypes. Moreover, this protocol details a robust and reproducible method for assessing the replication capacity of the Gag-MJ4 chimeric viruses on a CEM-based T cell line. This method was utilized for the study of Gag-MJ4 chimeric viruses derived from 149 subtype C acutely infected Zambians, and has allowed for the identification of residues in Gag that affect replication. More importantly, the implementation of this technique has facilitated a deeper understanding of how viral replication defines parameters of early HIV-1 pathogenesis such as set point viral load and longitudinal CD4+ T cell decline.
Infectious Diseases, Issue 90, HIV-1, Gag, viral replication, replication capacity, viral fitness, MJ4, CEM, GXR25
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g. drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3.
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recently published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4 in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Generation of Enterobacter sp. YSU Auxotrophs Using Transposon Mutagenesis
Institutions: Youngstown State University.
Prototrophic bacteria grow on M-9 minimal salts medium supplemented with glucose, which is used as a carbon and energy source (M-9 medium). Auxotrophs can be generated using a transposome. The commercially available, Tn5-derived transposome used in this protocol consists of a linear segment of DNA containing an R6Kγ replication origin, a gene for kanamycin resistance and two mosaic sequence ends, which serve as transposase binding sites. The transposome, provided as a DNA/transposase protein complex, is introduced by electroporation into the prototrophic strain, Enterobacter sp. YSU, and randomly incorporates itself into this host’s genome. Transformants are replica plated onto Luria-Bertani agar plates containing kanamycin (LB-kan) and onto M-9 medium agar plates containing kanamycin (M-9-kan). The transformants that grow on LB-kan plates but not on M-9-kan plates are considered to be auxotrophs. Purified genomic DNA from an auxotroph is partially digested, ligated and transformed into a pir+ Escherichia coli (E. coli) strain. The R6Kγ replication origin allows the plasmid to replicate in pir+ E. coli strains, and the kanamycin resistance marker allows for plasmid selection. Each transformant possesses a new plasmid containing the transposon flanked by the interrupted chromosomal region. Sanger sequencing and the Basic Local Alignment Search Tool (BLAST) suggest a putative identity of the interrupted gene. There are three advantages to using this transposome mutagenesis strategy. First, it does not rely on the expression of a transposase gene by the host. Second, the transposome is introduced into the target host by electroporation, rather than by conjugation or by transduction, and therefore is more efficient. Third, the R6Kγ replication origin makes it easy to identify the mutated gene, which is partially recovered in a recombinant plasmid. This technique can be used to investigate the genes involved in other characteristics of Enterobacter sp. YSU or of a wider variety of bacterial strains.
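The replica-plating rule stated above (growth on LB-kan but not on M-9-kan marks an auxotroph) can be written as a short filter. This is a minimal sketch that assumes growth has already been scored as true or false for each colony on each plate type. The colony identifiers, the data layout and the function name are hypothetical.

```python
def find_auxotrophs(growth):
    """Return colony IDs that grew on LB-kan but not on M-9-kan.

    growth maps a colony ID to a dict with boolean entries for the two
    plate types used in replica plating.
    """
    return sorted(
        colony
        for colony, plates in growth.items()
        if plates["LB-kan"] and not plates["M-9-kan"]
    )


# Hypothetical replica-plating scores for four transformants.
replica_scores = {
    "T1": {"LB-kan": True, "M-9-kan": True},    # prototroph with insertion
    "T2": {"LB-kan": True, "M-9-kan": False},   # candidate auxotroph
    "T3": {"LB-kan": False, "M-9-kan": False},  # no growth on either plate
    "T4": {"LB-kan": True, "M-9-kan": False},   # candidate auxotroph
}

auxotrophs = find_auxotrophs(replica_scores)
print(auxotrophs)  # ['T2', 'T4']
```

Colonies that fail to grow on both plates are excluded because their lack of growth on minimal medium cannot be attributed to a nutritional requirement.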
Microbiology, Issue 92, Auxotroph, transposome, transposon, mutagenesis, replica plating, glucose minimal medium, complex medium, Enterobacter
Mouse Genome Engineering Using Designer Nucleases
Institutions: University of Zurich, University of Minnesota.
Transgenic mice carrying site-specific genome modifications (knockout, knock-in) are of vital importance for dissecting complex biological systems as well as for modeling human diseases and testing therapeutic strategies. Recent advances in the use of designer nucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system for site-specific genome engineering open the possibility to perform rapid targeted genome modification in virtually any laboratory species without the need to rely on embryonic stem (ES) cell technology. A genome editing experiment typically starts with identification of designer nuclease target sites within a gene of interest followed by construction of custom DNA-binding domains to direct nuclease activity to the investigator-defined genomic locus. Designer nuclease plasmids are in vitro transcribed to generate mRNA for microinjection of fertilized mouse oocytes. Here, we provide a protocol for achieving targeted genome modification by direct injection of TALEN mRNA into fertilized mouse oocytes.
Genetics, Issue 86, Oocyte microinjection, Designer nucleases, ZFN, TALEN, Genome Engineering
Serial Enrichment of Spermatogonial Stem and Progenitor Cells (SSCs) in Culture for Derivation of Long-term Adult Mouse SSC Lines
Institutions: Weill Cornell Medical College.
Spermatogonial stem and progenitor cells (SSCs) of the testis represent a classic example of adult mammalian stem cells and preserve fertility for nearly the lifetime of the animal. While the precise mechanisms that govern self-renewal and differentiation in vivo are challenging to study, various systems have been developed previously to propagate murine SSCs in vitro using a combination of specialized culture media and feeder cells1-3.
Most in vitro forays into the biology of SSCs have derived cell lines from neonates, possibly due to the difficulty in obtaining adult cell lines4. However, the testis continues to mature up until ~5 weeks of age in most mouse strains. In the early post-natal period, dramatic changes occur in the architecture of the testis and in the biology of both somatic and spermatogenic cells, including alterations in expression levels of numerous stem cell-related genes. Therefore, neonatally-derived SSC lines may not fully recapitulate the biology of adult SSCs that persist after the adult testis has reached a steady state.
Several factors have hindered the production of adult SSC lines historically. First, the proportion of functional stem cells may decrease during adulthood, either due to intrinsic or extrinsic factors5,6. Furthermore, as with other adult stem cells, it has been difficult to enrich SSCs sufficiently from total adult testicular cells without using a combination of immunoselection or other sorting strategies7. Commonly employed strategies include the use of cryptorchid mice as a source of donor cells due to a higher ratio of stem cells to other cell types8. Based on the hypothesis that removal of somatic cells from the initial culture disrupts interactions with the stem cell niche that are essential for SSC survival, we previously developed methods to derive adult lines that do not require immunoselection or cryptorchid donors but rather employ serial enrichment of SSCs in culture, referred to hereafter as SESC2,3.
The method described below entails a simple procedure for deriving adult SSC lines by dissociating adult donor seminiferous tubules, followed by plating of cells on feeders comprised of a testicular stromal cell line (JK1)3. Through serial passaging, strongly adherent, contaminating non-germ cells are depleted from the culture with concomitant enrichment of SSCs. Cultures produced in this manner contain a mixture of spermatogonia at different stages of differentiation, which contain SSCs, based on long-term self renewal capability. The crux of the SESC method is that it enables SSCs to make the difficult transition from self-renewal in vivo to long-term self-renewal in vitro in a radically different microenvironment, produces long-term SSC lines, free of contaminating somatic cells, and thereby enables subsequent experimental manipulation of SSCs.
Stem Cell Biology, Issue 72, Molecular Biology, Cellular Biology, Medicine, Genetics, Developmental Biology, Anatomy, Surgery, Spermatogonial Stem cells, Stem cells, feeder cells, germ cells, testis, cell culture, microenvironment, stem cell niche, progenitor cells, mice, transgenic mice, animal model
Chromatin Immunoprecipitation (ChIP) using Drosophila tissue
Institutions: Johns Hopkins University.
Epigenetics remains a rapidly developing field that studies how the chromatin state contributes to differential gene expression in distinct cell types at different developmental stages. Epigenetic regulation contributes to a broad spectrum of biological processes, including cellular differentiation during embryonic development and homeostasis in adulthood. A critical strategy in epigenetic studies is to examine how various histone modifications and chromatin factors regulate gene expression. To address this, Chromatin Immunoprecipitation (ChIP) is used widely to obtain a snapshot of the association of particular factors with DNA in the cells of interest.
The ChIP technique commonly uses cultured cells as starting material, which can be obtained in abundance and homogeneity to generate reproducible data. However, there are several caveats. First, the environment in which cells grow in a Petri dish differs from that in vivo, and thus may not reflect the endogenous chromatin state of cells in a living organism. Second, not all types of cells can be cultured ex vivo. There are only a limited number of cell lines from which enough material can be obtained for a ChIP assay.
Here we describe a method for performing ChIP experiments using Drosophila tissues. The starting material is dissected tissue from a living animal, and thus can accurately reflect the endogenous chromatin state. The adaptability of this method to many different types of tissue will allow researchers to address many more biologically relevant questions regarding epigenetic regulation in vivo1, 2. Combining this method with high-throughput sequencing (ChIP-seq) will further allow researchers to obtain an epigenomic landscape.
Genetics, Issue 61, ChIP, Drosophila, testes, q-PCR, high throughput sequencing, epi-genetics
DNA-affinity-purified Chip (DAP-chip) Method to Determine Gene Targets for Bacterial Two component Regulatory Systems
Institutions: Lawrence Berkeley National Laboratory.
Methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro microarray based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs and thus predicting the function for a sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris.
Genetics, Issue 89, DNA-Affinity-Purified-chip, response regulator, transcription factor binding site, two component system, signal transduction, Desulfovibrio, lactate utilization regulator, ChIP-chip
Genetic Manipulation in Δku80 Strains for Functional Genomic Analysis of Toxoplasma gondii
Institutions: The Geisel School of Medicine at Dartmouth.
Targeted genetic manipulation using homologous recombination is the method of choice for functional genomic analysis to obtain a detailed view of gene function and phenotype(s). The development of mutant strains with targeted gene deletions, targeted mutations, complemented gene function, and/or tagged genes provides powerful strategies to address gene function, particularly if these genetic manipulations can be efficiently targeted to the gene locus of interest using integration mediated by double cross over homologous recombination.
Due to very high rates of nonhomologous recombination, functional genomic analysis of Toxoplasma gondii has been previously limited by the absence of efficient methods for targeting gene deletions and gene replacements to specific genetic loci. Recently, we abolished the major pathway of nonhomologous recombination in type I and type II strains of T. gondii by deleting the gene encoding the KU80 protein1,2. The Δku80 strains behave normally during tachyzoite (acute) and bradyzoite (chronic) stages in vitro and in vivo and exhibit essentially a 100% frequency of homologous recombination. The Δku80 strains make functional genomic studies feasible on the single gene as well as on the genome scale1-4.
Here, we report methods for using type I and type II Δku80Δhxgprt strains to advance gene targeting approaches in T. gondii. We outline efficient methods for generating gene deletions, gene replacements, and tagged genes by targeted insertion or deletion of the hypoxanthine-xanthine-guanine phosphoribosyltransferase (HXGPRT) selectable marker. The described gene targeting protocol can be used in a variety of ways in Δku80 strains to advance functional analysis of the parasite genome and to develop single strains that carry multiple targeted genetic manipulations. The application of this genetic method and subsequent phenotypic assays will reveal fundamental and unique aspects of the biology of T. gondii and related significant human pathogens that cause malaria (Plasmodium sp.) and cryptosporidiosis (Cryptosporidium).
Infectious Diseases, Issue 77, Genetics, Microbiology, Infection, Medicine, Immunology, Molecular Biology, Cellular Biology, Biomedical Engineering, Bioengineering, Genomics, Parasitology, Pathology, Apicomplexa, Coccidia, Toxoplasma, Genetic Techniques, Gene Targeting, Eukaryota, Toxoplasma gondii, genetic manipulation, gene targeting, gene deletion, gene replacement, gene tagging, homologous recombination, DNA, sequencing
Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation
Institutions: Agency for Science, Technology and Research, Singapore, A*STAR-Duke-NUS Neuroscience Research Partnership, Singapore, National University of Singapore, Singapore.
Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces 7, 2, 12. Such architectures are not random and involve interactions between gene promoters and regulatory elements 13. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination 1, 14.
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures 5,6. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis are then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions 8.
We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video.
Genetics, Issue 62, ChIP, ChIA-PET, Chromatin Interactions, Genomics, Next-Generation Sequencing
High-throughput Functional Screening using a Homemade Dual-glow Luciferase Assay
Institutions: Massachusetts General Hospital.
We present a rapid and inexpensive high-throughput screening protocol to identify transcriptional regulators of alpha-synuclein, a gene associated with Parkinson's disease. 293T cells are transiently transfected with plasmids from an arrayed ORF expression library, together with luciferase reporter plasmids, in a one-gene-per-well microplate format. Firefly luciferase activity is assayed after 48 hr to determine the effects of each library gene upon alpha-synuclein transcription, normalized to expression from an internal control construct (a hCMV promoter directing Renilla luciferase). This protocol is facilitated by a bench-top robot enclosed in a biosafety cabinet, which performs aseptic liquid handling in 96-well format. Our automated transfection protocol is readily adaptable to high-throughput lentiviral library production or other functional screening protocols requiring triple-transfections of large numbers of unique library plasmids in conjunction with a common set of helper plasmids. We also present an inexpensive and validated alternative to commercially available dual-luciferase reagents, which employs PTC124, EDTA, and pyrophosphate to suppress firefly luciferase activity prior to measurement of Renilla luciferase. Using these methods, we screened 7,670 human genes and identified 68 regulators of alpha-synuclein. This protocol is easily modifiable to target other genes of interest.
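The normalization described above, firefly signal divided by the Renilla internal-control signal in the same well, can be sketched in a few lines. This is a minimal illustration, not the authors' analysis pipeline: the well names, the readings, and the comparison against the plate median are assumptions introduced only for the example.

```python
from statistics import median


def normalized_ratios(wells):
    """Return firefly/Renilla ratios per well.

    `wells` maps a well name to a (firefly, renilla) pair of raw readings.
    Wells with a non-positive Renilla signal are skipped, since the
    internal control cannot normalize them.
    """
    ratios = {}
    for well, (firefly, renilla) in wells.items():
        if renilla <= 0:
            continue
        ratios[well] = firefly / renilla
    return ratios


def fold_change_vs_plate_median(ratios):
    """Express each ratio relative to the plate median (an assumed baseline)."""
    plate_median = median(ratios.values())
    return {well: ratio / plate_median for well, ratio in ratios.items()}


# Hypothetical readings for four wells; A4 has no usable control signal.
wells = {
    "A1": (1000.0, 500.0),
    "A2": (4000.0, 500.0),
    "A3": (900.0, 450.0),
    "A4": (100.0, 0.0),
}
ratios = normalized_ratios(wells)
fold = fold_change_vs_plate_median(ratios)
```

In a real screen the hit threshold and the choice of baseline (plate median, empty-vector wells, or another control) would depend on the experimental design.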
Cellular Biology, Issue 88, Luciferases, Gene Transfer Techniques, Transfection, High-Throughput Screening Assays, Transfections, Robotics
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University, BioMérieux, Hospices Civils de Lyon, Hospices Civils de Lyon.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values1. 'How to use PSA' remains a current issue, either for diagnosis, where a gray zone corresponding to a serum concentration of 2.5-10 ng/ml does not allow a clear differentiation to be made between cancer and noncancer2, or for patient follow-up, as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application3,4. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g., PCA3 in prostate cancer5,6, and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5 times higher than that corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as this is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes, leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer7-10. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, and whether they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1).
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Ex vivo Culture of Drosophila Pupal Testis and Single Male Germ-line Cysts: Dissection, Imaging, and Pharmacological Treatment
Institutions: Philipps-Universität Marburg, Philipps-Universität Marburg.
During spermatogenesis in mammals and in Drosophila melanogaster, male germ cells develop in a series of essential developmental processes. This includes differentiation from a stem cell population, mitotic amplification, and meiosis. In addition, post-meiotic germ cells undergo a dramatic morphological reshaping process as well as a global epigenetic reconfiguration of the germ line chromatin: the histone-to-protamine switch.
Studying the role of a protein in post-meiotic spermatogenesis using mutagenesis or other genetic tools is often impeded by essential embryonic, pre-meiotic, or meiotic functions of the protein under investigation. The post-meiotic phenotype of a mutant of such a protein could be obscured through an earlier developmental block, or the interpretation of the phenotype could be complicated. The model organism Drosophila melanogaster offers a bypass to this problem: intact testes and even cysts of germ cells dissected from early pupae are able to develop ex vivo in culture medium. Making use of such cultures allows microscopic imaging of living germ cells in testes and of germ-line cysts. Importantly, the cultivated testes and germ cells also become accessible to pharmacological inhibitors, thereby permitting manipulation of enzymatic functions during spermatogenesis, including post-meiotic stages.
The protocol presented describes how to dissect and cultivate pupal testes and germ-line cysts. Information on the development of pupal testes and culture conditions is provided alongside microscope imaging data of live testes and germ-line cysts in culture. We also describe a pharmacological assay to study post-meiotic spermatogenesis, exemplified by an assay targeting the histone-to-protamine switch using the histone acetyltransferase inhibitor anacardic acid. In principle, this cultivation method could be adapted to address many other research questions in pre- and post-meiotic spermatogenesis.
Developmental Biology, Issue 91, Ex vivo culture, testis, male germ-line cells, Drosophila, imaging, pharmacological assay
A Toolkit to Enable Hydrocarbon Conversion in Aqueous Environments
Institutions: Delft University of Technology, Delft University of Technology.
This work puts forward a toolkit that enables the conversion of alkanes by Escherichia coli and presents a proof of principle of its applicability. The toolkit consists of multiple standard interchangeable parts (BioBricks)9 addressing the conversion of alkanes, regulation of gene expression and survival in toxic hydrocarbon-rich environments.
A three-step pathway for alkane degradation was implemented in E. coli to enable the conversion of medium- and long-chain alkanes to their respective alkanols, alkanals and ultimately alkanoic acids. The latter were metabolized via the native β-oxidation pathway. To facilitate the oxidation of medium-chain alkanes (C5-C13) and cycloalkanes (C5-C8), four genes (including alkB2) of the alkane hydroxylase system from Gordonia were transformed into E. coli. For the conversion of long-chain alkanes (C15-C36), the ladA gene from Geobacillus thermodenitrificans was implemented. For the required further steps of the degradation process, ADH and ALDH (originating from G. thermodenitrificans) were introduced10,11. The activity was measured by resting cell assays. For each oxidative step, enzyme activity was observed.
To optimize the process efficiency, the expression was only induced under low glucose conditions: a substrate-regulated promoter, pCaiF, was used. pCaiF is present in E. coli K12 and regulates the expression of the genes involved in the degradation of non-glucose carbon sources.
The last part of the toolkit, targeting survival, was implemented using solvent tolerance genes, PhPFDα and β, both from Pyrococcus horikoshii OT3. Organic solvents can induce cell stress and decreased survivability by negatively affecting protein folding. As chaperones, PhPFDα and β improve the protein folding process, e.g., in the presence of alkanes. The expression of these genes led to improved hydrocarbon tolerance, shown by an increased growth rate (up to 50%) in the presence of 10% n-hexane in the culture medium.
Summarizing, the results indicate that the toolkit enables E. coli to convert and tolerate hydrocarbons in aqueous environments. As such, it represents an initial step towards a sustainable solution for oil-remediation using a synthetic biology approach.
Bioengineering, Issue 68, Microbiology, Biochemistry, Chemistry, Chemical Engineering, Oil remediation, alkane metabolism, alkane hydroxylase system, resting cell assay, prefoldin, Escherichia coli, synthetic biology, homologous interaction mapping, mathematical model, BioBrick, iGEM
Affinity-based Isolation of Tagged Nuclei from Drosophila Tissues for Gene Expression Analysis
Institutions: Purdue University.
Drosophila embryonic and larval tissues often contain a highly heterogeneous mixture of cell types, which can complicate the analysis of gene expression in these tissues. Thus, to analyze cell-specific gene expression profiles from Drosophila tissues, it may be necessary to isolate specific cell types with high purity and at sufficient yields for downstream applications such as transcriptional profiling and chromatin immunoprecipitation. However, the irregular cellular morphology in tissues such as the central nervous system, coupled with the rare population of specific cell types in these tissues, can pose challenges for traditional methods of cell isolation such as laser microdissection and fluorescence-activated cell sorting (FACS). Here, an alternative approach to characterizing cell-specific gene expression profiles using affinity-based isolation of tagged nuclei, rather than whole cells, is described. Nuclei in the specific cell type of interest are genetically labeled with a nuclear envelope-localized EGFP tag using the Gal4/UAS binary expression system. These EGFP-tagged nuclei can be isolated using antibodies against GFP that are coupled to magnetic beads. The approach described in this protocol enables consistent isolation of nuclei from specific cell types in the Drosophila larval central nervous system at high purity and at sufficient levels for expression analysis, even when these cell types comprise less than 2% of the total cell population in the tissue. This approach can be used to isolate nuclei from a wide variety of Drosophila embryonic and larval cell types using specific Gal4 drivers, and may be useful for isolating nuclei from cell types that are not suitable for FACS or laser microdissection.
Biochemistry, Issue 85, Gene Expression, nuclei isolation, Drosophila, KASH, GFP, cell-type specific
Modeling Astrocytoma Pathogenesis In Vitro and In Vivo Using Cortical Astrocytes or Neural Stem Cells from Conditional, Genetically Engineered Mice
Institutions: University of North Carolina School of Medicine, University of North Carolina School of Medicine, University of North Carolina School of Medicine, University of North Carolina School of Medicine, University of North Carolina School of Medicine, Emory University School of Medicine, University of North Carolina School of Medicine.
Current astrocytoma models are limited in their ability to define the roles of oncogenic mutations in specific brain cell types during disease pathogenesis and their utility for preclinical drug development. In order to design a better model system for these applications, phenotypically wild-type cortical astrocytes and neural stem cells (NSC) from conditional, genetically engineered mice (GEM) that harbor various combinations of floxed oncogenic alleles were harvested and grown in culture. Genetic recombination was induced in vitro using adenoviral Cre-mediated recombination, resulting in expression of mutated oncogenes and deletion of tumor suppressor genes. The phenotypic consequences of these mutations were defined by measuring proliferation, transformation, and drug response in vitro. Orthotopic allograft models, whereby transformed cells are stereotactically injected into the brains of immune-competent, syngeneic littermates, were developed to define the role of oncogenic mutations and cell type on tumorigenesis in vivo. Unlike most established human glioblastoma cell line xenografts, injection of transformed GEM-derived cortical astrocytes into the brains of immune-competent littermates produced astrocytomas, including the most aggressive subtype, glioblastoma, that recapitulated the histopathological hallmarks of human astrocytomas, including diffuse invasion of normal brain parenchyma. Bioluminescence imaging of orthotopic allografts from transformed astrocytes engineered to express luciferase was utilized to monitor in vivo tumor growth over time. Thus, astrocytoma models using astrocytes and NSC harvested from GEM with conditional oncogenic alleles provide an integrated system to study the genetics and cell biology of astrocytoma pathogenesis in vitro and in vivo and may be useful in preclinical drug development for these devastating diseases.
Neuroscience, Issue 90, astrocytoma, cortical astrocytes, genetically engineered mice, glioblastoma, neural stem cells, orthotopic allograft
Polysome Fractionation and Analysis of Mammalian Translatomes on a Genome-wide Scale
Institutions: McGill University, Karolinska Institutet, McGill University.
mRNA translation plays a central role in the regulation of gene expression and represents the most energy consuming process in mammalian cells. Accordingly, dysregulation of mRNA translation is considered to play a major role in a variety of pathological states including cancer. Ribosomes also host chaperones, which facilitate folding of nascent polypeptides, thereby modulating function and stability of newly synthesized polypeptides. In addition, emerging data indicate that ribosomes serve as a platform for a repertoire of signaling molecules, which are implicated in a variety of post-translational modifications of newly synthesized polypeptides as they emerge from the ribosome, and/or components of translational machinery. Herein, a well-established method of ribosome fractionation using sucrose density gradient centrifugation is described. In conjunction with the in-house developed “anota” algorithm this method allows direct determination of differential translation of individual mRNAs on a genome-wide scale. Moreover, this versatile protocol can be used for a variety of biochemical studies aiming to dissect the function of ribosome-associated protein complexes, including those that play a central role in folding and degradation of newly synthesized polypeptides.
Biochemistry, Issue 87, Cells, Eukaryota, Nutritional and Metabolic Diseases, Neoplasms, Metabolic Phenomena, Cell Physiological Phenomena, mRNA translation, ribosomes, protein synthesis, genome-wide analysis, translatome, mTOR, eIF4E, 4E-BP1
Targeted Expression of GFP in the Hair Follicle Using Ex Vivo Viral Transduction
Institutions: AntiCancer, Inc.
There are many cell types in the hair follicle, including hair matrix cells, which form the hair shaft; stem cells, which can initiate the hair shaft during early anagen, the growth phase of the hair cycle; and pluripotent stem cells, which play a role in hair follicle growth but have the potential to differentiate into non-follicle cells such as neurons. These properties of the hair follicle are discussed. The various cell types of the hair follicle are potential targets for gene therapy. Gene delivery systems for the hair follicle that use viral vectors or liposomes to target the various cell types, and the results obtained with them, are also discussed.
Cellular Biology, Issue 13, Springer Protocols, hair follicles, liposomes, adenovirus, genes, stem cells
Principles of Site-Specific Recombinase (SSR) Technology
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Site-specific recombinase (SSR) technology allows the manipulation of gene structure to explore gene function and has become an integral tool of molecular biology. Site-specific recombinases are proteins that bind to distinct DNA target sequences. The Cre/lox system was first described in bacteriophages during the 1980s. Cre recombinase is a Type I topoisomerase that catalyzes site-specific recombination of DNA between two loxP (locus of X-over P1) sites. The Cre/lox system does not require any cofactors. LoxP sequences contain distinct binding sites for Cre recombinases that surround a directional core sequence where recombination and rearrangement take place. When cells contain loxP sites and express the Cre recombinase, a recombination event occurs. Double-stranded DNA is cut at both loxP sites by the Cre recombinase, rearranged, and ligated ("scissors and glue"). Products of the recombination event depend on the relative orientation of the asymmetric sequences.
SSR technology is frequently used as a tool to explore gene function. Here, the gene of interest is flanked with the Cre target sites, loxP ("floxed"). Animals are then crossed with animals expressing the Cre recombinase under the control of a tissue-specific promoter. In tissues that express the Cre recombinase, it binds to the target sequences and excises the floxed gene. Controlled gene deletion allows the investigation of gene function in specific tissues and at distinct time points. Analysis of gene function employing SSR technology (conditional mutagenesis) has significant advantages over traditional knock-outs, where gene deletion is frequently lethal.
Cellular Biology, Issue 15, Molecular Biology, Site-Specific Recombinase, Cre recombinase, Cre/lox system, transgenic animals, transgenic technology
Building a Better Mosquito: Identifying the Genes Enabling Malaria and Dengue Fever Resistance in A. gambiae and A. aegypti Mosquitoes
Institutions: Johns Hopkins University.
In this interview, George Dimopoulos focuses on the physiological mechanisms used by mosquitoes to combat Plasmodium falciparum and dengue virus infections. He explains how key refractory genes, those conferring resistance to vector pathogens, are identified in the mosquito and how this knowledge can be used to generate transgenic mosquitoes that are unable to carry the malaria parasite or dengue virus.
Cellular Biology, Issue 5, Translational Research, mosquito, malaria, virus, dengue, genetics, injection, RNAi, transgenesis, transgenic
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
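To make the three motif classes concrete, the sketch below expands a motif written in IUPAC nucleotide codes (for example S = C or G, W = A or T, N = any base) into a regular expression and reports where it occurs in a sequence. It only illustrates what non-degenerate, degenerate, and bipartite motifs look like when matched; it is not the BEAM, PRISM, or SPACER algorithm, it does no scoring for over-representation, and it scans the forward strand only. The function names and test sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex.

    The zero-width lookahead lets overlapping occurrences be reported.
    """
    body = "".join(IUPAC[code] for code in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(motif, sequence):
    """Return 0-based start positions of the motif on the forward strand."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


# The three example motifs quoted in the abstract.
exact_hits = find_motif("ACCGGT", "TTACCGGTAA")
degenerate_hits = find_motif("ASCGWT", "AGCGATACCGTT")
bipartite = "ACCnnnnnnnnGGT"
spacer_len = len(bipartite) - 6  # number of unconstrained positions
bipartite_hits = find_motif(bipartite, "ACC" + "A" * spacer_len + "GGT")
```

A fuller treatment would also scan the reverse complement and compare occurrence counts against a background model, which is where the real algorithms do their work.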
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif