In JoVE (1)
Other Publications (41)
- Molecular Cell
- Science (New York, N.Y.)
- The EMBO Journal
- Proceedings of the National Academy of Sciences of the United States of America
- Journal of Computational Biology : a Journal of Computational Molecular Cell Biology
- Bioinformatics (Oxford, England)
- BMC Bioinformatics
- Nature Biotechnology
- Nature Genetics
- BMC Genomics
- Nature Genetics
- Molecular Systems Biology
- Clinical Cancer Research : an Official Journal of the American Association for Cancer Research
- Bioinformatics (Oxford, England)
- Genome Biology
- Investigative Ophthalmology & Visual Science
- Proceedings of the National Academy of Sciences of the United States of America
- Genome Research
- Nucleic Acids Research
- Cancer Research
- Nature Structural & Molecular Biology
- Molecular Systems Biology
- Journal of Computational Biology : a Journal of Computational Molecular Cell Biology
- Chromosome Research : an International Journal on the Molecular, Supramolecular and Evolutionary Aspects of Chromosome Biology
- Genome Research
- Cancer Cell
- PLoS Genetics
- BMC Genomics
- Wiley Interdisciplinary Reviews. Systems Biology and Medicine
- Bioinformatics (Oxford, England)
- Biophysical Journal
- Nature Reviews. Genetics
- PloS One
- PloS One
- BioMed Research International
- Molecular Cell
- BioEssays : News and Reviews in Molecular, Cellular and Developmental Biology
- Nucleic Acids Research
- Cell Cycle (Georgetown, Tex.)
Articles by Itamar Simon in JoVE
Genome-wide Determination of Mammalian Replication Timing by DNA Content Measurement
Yishai Yehuda (1), Britny Blumenfeld (1), Dan Lehmann (2), Itamar Simon (1)
(1) Dept. of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem; (2) The Core Research Facility, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem
Other articles by Itamar Simon on PubMed
The Genome-wide Localization of Rsc9, a Component of the RSC Chromatin-remodeling Complex, Changes in Response to Stress
Molecular Cell. Mar, 2002 | Pubmed ID: 11931764
The cellular response to environmental changes includes widespread modifications in gene expression. Here we report the identification and characterization of Rsc9, a member of the RSC chromatin-remodeling complex in yeast. The genome-wide localization of Rsc9 indicated a relationship between genes targeted by Rsc9 and genes regulated by stress; treatment with hydrogen peroxide or rapamycin, which inhibits TOR signaling, resulted in genome-wide changes in Rsc9 occupancy. We further show that Rsc9 is involved in both repression and activation of mRNAs regulated by TOR as well as the synthesis of rRNA. Our results illustrate the response of a chromatin-remodeling factor to signaling cascades and suggest that changes in the activity of chromatin-remodeling factors are reflected in changes in their localization in the genome.
Science (New York, N.Y.). Oct, 2002 | Pubmed ID: 12399584
We have determined how most of the transcriptional regulators encoded in the eukaryote Saccharomyces cerevisiae associate with genes across the genome in living cells. Just as maps of metabolic networks describe the potential pathways that may be used by a cell to accomplish metabolic processes, this network of regulator-gene interactions describes potential pathways yeast cells can use to regulate global gene expression programs. We use this information to identify network motifs, the simplest units of network architecture, and demonstrate that an automated process can use motifs to assemble a transcriptional regulatory network structure. Our results reveal that eukaryotic cellular functions are highly connected through networks of transcriptional regulators that regulate other transcriptional regulators.
Program-specific Distribution of a Transcription Factor Dependent on Partner Transcription Factor and MAPK Signaling
Cell. May, 2003 | Pubmed ID: 12732146
Specialized gene expression programs are induced by signaling pathways that act on transcription factors. Whether these transcription factors can function in multiple developmental programs through a global switch in promoter selection is not known. We have used genome-wide location analysis to show that the yeast Ste12 transcription factor, which regulates mating and filamentous growth, is bound to distinct program-specific target genes dependent on the developmental condition. This condition-dependent distribution of Ste12 requires concurrent binding of the transcription factor Tec1 during filamentation and is differentially regulated by the MAP kinases Fus3 and Kss1. Program-specific distribution across the genome may be a general mechanism by which transcription factors regulate distinct gene expression programs in response to signaling.
Systematic Analysis of Essential Yeast TAFs in Genome-wide Transcription and Preinitiation Complex Assembly
The EMBO Journal. Jul, 2003 | Pubmed ID: 12840001
The general transcription factor TFIID is composed of the TATA box binding protein (TBP) and a set of conserved TBP-associated factors (TAFs). Here we report the completion of genome-wide expression profiling analyses of yeast strains bearing temperature-sensitive mutations in each of the 13 essential TAFs. The percentage of the yeast genome dependent on each TAF ranges from 3% (TAF2) to 59-61% (TAF9). Approximately 84% of yeast genes are dependent upon one or more TAFs and 16% of yeast genes are TAF independent. In addition, this complete analysis defines three distinct classes of yeast promoters whose transcriptional requirements for TAFs differ substantially. Using this collection of temperature-sensitive mutants, we show that in all cases the transcriptional dependence for a TAF can be explained by a requirement for TBP recruitment and assembly of the preinitiation complex (PIC). Unexpectedly, these assembly experiments reveal that TAF11 and TAF13 appear to provide the critical functional contacts with TBP during PIC assembly. Collectively, our results confirm and extend the proposal that individual TAFs have selective transcriptional roles and distinct functions.
Comparing the Continuous Representation of Time-series Expression Profiles to Identify Differentially Expressed Genes
Proceedings of the National Academy of Sciences of the United States of America. Sep, 2003 | Pubmed ID: 12934016
We present a general algorithm to detect genes differentially expressed between two nonhomogeneous time-series data sets. As increasing amounts of high-throughput biological data become available, a major challenge in genomic and computational biology is to develop methods for comparing data from different experimental sources. Time-series whole-genome expression data are a particularly valuable source of information because they can describe an unfolding biological process such as the cell cycle or immune response. However, comparisons of time-series expression data sets are hindered by biological and experimental inconsistencies such as differences in sampling rate, variations in the timing of biological processes, and the lack of repeats. Our algorithm overcomes these difficulties by using a continuous representation for time-series data and combining a noise model for individual samples with a global difference measure. We introduce a corresponding statistical method for computing the significance of this differential expression measure. We used our algorithm to compare cell-cycle-dependent gene expression in wild-type and knockout yeast strains. Our algorithm identified a set of 56 differentially expressed genes, and these results were validated by using independent protein-DNA-binding data. Unlike previous methods, our algorithm was also able to identify 22 non-cell-cycle-regulated genes as differentially expressed. This set of genes is significantly correlated in a set of independent expression experiments, suggesting additional roles for the transcription factors Fkh1 and Fkh2 in controlling cellular activity in yeast.
Journal of Computational Biology : a Journal of Computational Molecular Cell Biology. 2003 | Pubmed ID: 12935332
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene specific parameters. We show that unobserved time points can be reconstructed using our method with 10-15% less error when compared to previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to nonuniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knock-out data that produces biologically meaningful results.
Bioinformatics (Oxford, England). Aug, 2004 | Pubmed ID: 15262777
In the study of many systems, cells are first synchronized so that a large population of cells exhibit similar behavior. While synchronization can usually be achieved for a short duration, after a while cells begin to lose their synchronization. Synchronization loss is a continuous process and so the observed value in a population of cells for a gene at time t is actually a convolution of its values in an interval around t. Deconvolving the observed values from a mixed population will allow us to obtain better models for these systems and to accurately detect the genes that participate in these systems.
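The convolution described above can be illustrated with a minimal sketch. This is not the authors' method: it only shows the forward direction (how a mixed population blurs a single-cell profile), and it assumes a fixed Gaussian spread of cell phases, a periodic profile, and made-up expression values. The names `gaussian_kernel` and `observe_population` are hypothetical.

```python
import math

def gaussian_kernel(half_width, sigma):
    """Return normalized weights for time offsets -half_width..half_width."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def observe_population(true_profile, kernel):
    """Blur a periodic single-cell profile with the kernel (circular convolution)."""
    n = len(true_profile)
    half = len(kernel) // 2
    observed = []
    for t in range(n):
        value = 0.0
        for j, w in enumerate(kernel):
            # Cells slightly ahead of or behind time t contribute their own values.
            value += w * true_profile[(t + j - half) % n]
        observed.append(value)
    return observed

# Illustrative values only: a sharp expression peak at time index 3.
true_profile = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0]
# Assumption: a constant spread; real synchronization loss grows over time.
kernel = gaussian_kernel(half_width=2, sigma=1.0)
observed = observe_population(true_profile, kernel)
```

In this sketch the observed peak is lower and wider than the single-cell peak, which is the effect a deconvolution step would try to undo.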
BMC Bioinformatics. Jan, 2005 | Pubmed ID: 15661078
High-throughput genomic research tools are becoming standard in the biologist's toolbox. After processing the genomic data with one of the many available statistical algorithms to identify statistically significant genes, these genes need to be further analyzed for biological significance in light of all the existing knowledge. Literature mining--the process of representing literature data in a fashion that is easy to relate to genomic data--is one solution to this problem.
Nature Biotechnology. Dec, 2005 | Pubmed ID: 16333294
Expression profiling of time-series experiments is widely used to study biological systems. However, determining the quality of the resulting profiles remains a fundamental problem. Because of inadequate sampling rates, the effect of arrest-and-release methods and loss of synchronization, the measurements obtained from a series of time points may not accurately represent the underlying expression profiles. To solve this, we propose an approach that combines time-series and static (average) expression data analysis--for each gene, we determine whether its temporal expression profile can be reconciled with its static expression levels. We show that by combining synchronized and unsynchronized human cell cycle data, we can identify many cycling genes that are missed when using only time-series data. The algorithm also correctly distinguishes cycling genes from genes that specifically react to an environmental stimulus even if they share similar temporal expression profiles. Experimental validation of these results shows the utility of this analytical approach for determining the accuracy of gene expression patterns.
Nature Genetics. Feb, 2006 | Pubmed ID: 16444255
DNA methylation has a role in the regulation of gene expression during normal mammalian development but can also mediate epigenetic silencing of CpG island genes in cancer and other diseases. Many individual genes (including tumor suppressors) have been shown to undergo de novo methylation in specific tumor types, but the biological logic inherent in this process is not understood. To decipher this mechanism, we have adopted a new approach for detecting CpG island DNA methylation that can be used together with microarray technology. Genome-wide analysis by this technique demonstrated that tumor-specific methylated genes belong to distinct functional categories, have common sequence motifs in their promoters and are found in clusters on chromosomes. In addition, many are already repressed in normal cells. These results are consistent with the hypothesis that cancer-related de novo methylation may come about through an instructive mechanism.
BMC Genomics. Jun, 2006 | Pubmed ID: 16753054
On most common microarray platforms many genes are represented by multiple probes. Although this is quite common, no one has systematically explored the concordance between probes mapped to the same gene.
Polycomb-mediated Methylation on Lys27 of Histone H3 Pre-marks Genes for De Novo Methylation in Cancer
Nature Genetics. Feb, 2007 | Pubmed ID: 17200670
Many genes associated with CpG islands undergo de novo methylation in cancer. Studies have suggested that the pattern of this modification may be partially determined by an instructive mechanism that recognizes specifically marked regions of the genome. Using chromatin immunoprecipitation analysis, here we show that genes methylated in cancer cells are specifically packaged with nucleosomes containing histone H3 trimethylated on Lys27. This chromatin mark is established on these unmethylated CpG island genes early in development and then maintained in differentiated cell types by the presence of an EZH2-containing Polycomb complex. In cancer cells, as opposed to normal cells, the presence of this complex brings about the recruitment of DNA methyl transferases, leading to de novo methylation. These results suggest that tumor-specific targeting of de novo methylation is pre-programmed by an established epigenetic system that normally has a role in marking embryonic genes for repression.
Molecular Systems Biology. 2007 | Pubmed ID: 17224918
Even simple organisms have the ability to respond to internal and external stimuli. This response is carried out by a dynamic network of protein-DNA interactions that allows the specific regulation of genes needed for the response. We have developed a novel computational method that uses an input-output hidden Markov model to model these regulatory networks while taking into account their dynamic nature. Our method works by identifying bifurcation points, places in the time series where the expression of a subset of genes diverges from the rest of the genes. These points are annotated with the transcription factors regulating these transitions resulting in a unified temporal map. Applying our method to study yeast response to stress, we derive dynamic models that are able to recover many of the known aspects of these responses. Predictions made by our method have been experimentally validated leading to new roles for Ino4 and Gcn4 in controlling yeast response to stress. The temporal cascade of factors reveals common pathways and highlights differences between master and secondary factors in the utilization of network motifs and in condition-specific regulation.
Clinical Cancer Research : an Official Journal of the American Association for Cancer Research. Jul, 2007 | Pubmed ID: 17634531
Mammalian heparanase degrades heparan sulfate, the main polysaccharide of the basement membrane. Heparanase is an important determinant in cancer progression, acting via the breakdown of extracellular barriers for invasion, as well as release of heparan sulfate-bound angiogenic and growth-promoting factors. The present study was undertaken to elucidate molecular mechanisms responsible for heparanase overexpression in breast cancer.
Bioinformatics (Oxford, England). Jul, 2007 | Pubmed ID: 17646331
When analyzing expression experiments, researchers are often interested in identifying the set of biological processes that are up- or down-regulated under the experimental condition studied. Current approaches, including clustering expression profiles and averaging the expression profiles of genes known to participate in specific processes, fail to provide an accurate estimate of the activity levels of many biological processes.
Genome Biology. 2007 | Pubmed ID: 17650318
Global transcript levels throughout the cell cycle have been characterized using microarrays in several species. Early analysis of these experiments focused on individual species. More recently, a number of studies have concluded that a surprisingly small number of genes conserved in two or more species are periodically transcribed in these species. Combining and comparing data from multiple species is challenging because of noise in expression data, the different synchronization and scoring methods used, and the need to determine an accurate set of homologs.
Investigative Ophthalmology & Visual Science. Nov, 2007 | Pubmed ID: 17962435
The liver is the most common site of systemic metastases from uveal melanoma (UM). Such metastases usually continue to develop despite the application of current treatment modalities. This study was conducted to obtain insight into the molecular pathways that underlie the development of UM metastasis and thus to identify potential novel therapeutic pathways for this disease.
Genome-wide Transcriptional Analysis of the Human Cell Cycle Identifies Genes Differentially Regulated in Normal and Cancer Cells
Proceedings of the National Academy of Sciences of the United States of America. Jan, 2008 | Pubmed ID: 18195366
Characterization of the transcriptional regulatory network of the normal cell cycle is essential for understanding the perturbations that lead to cancer. However, the complete set of cycling genes in primary cells has not yet been identified. Here, we report the results of genome-wide expression profiling experiments on synchronized primary human foreskin fibroblasts across the cell cycle. Using a combined experimental and computational approach to deconvolve measured expression values into "single-cell" expression profiles, we were able to overcome the limitations inherent in synchronizing nontransformed mammalian cells. This allowed us to identify 480 periodically expressed genes in primary human foreskin fibroblasts. Analysis of the reconstructed primary cell profiles and comparison with published expression datasets from synchronized transformed cells reveals a large number of genes that cycle exclusively in primary cells. This conclusion was supported by both bioinformatic analysis and experiments performed on other cell types. We suggest that this approach will help pinpoint genetic elements contributing to normal cell growth and cellular transformation.
Genome Research. Oct, 2008 | Pubmed ID: 18669478
The division of genomes into distinct replication time zones has long been established. However, an in-depth understanding of their organization and their relationship to transcription is incomplete. Taking advantage of a novel synchronization method ("baby machine") and of genomic DNA microarrays, we have, for the first time, mapped replication times of the entire mouse genome at a high temporal resolution. Our data revealed that although most of the genome has a distinct time of replication in either early, middle, or late S phase, a significant portion of the genome is replicated asynchronously. Analysis of the replication map revealed the genomic scale organization of the replication time zones. We found that the genomic regions between early and late replication time zones often consist of extremely large replicons. Analysis of the relationship between replication and transcription revealed that early replication is frequently correlated with the transcription potential of a gene and not necessarily with its actual transcriptional activity. These findings, along with the strong conservation found between replication timing in human and mouse genomes, emphasize the importance of replication timing in transcription regulation.
Nucleic Acids Research. Oct, 2008 | Pubmed ID: 18676451
The Gene Ontology (GO) is extensively used to analyze all types of high-throughput experiments. However, researchers still face several challenges when using GO and other functional annotation databases. One problem is the large number of multiple hypotheses that are being tested for each study. In addition, categories often overlap with both direct parents/descendents and other distant categories in the hierarchical structure. This makes it hard to determine if the identified significant categories represent different functional outcomes or rather a redundant view of the same biological processes. To overcome these problems we developed a generative probabilistic model which identifies a (small) subset of categories that, together, explain the selected gene set. Our model accommodates noise and errors in the selected gene set and GO. Using controlled GO data our method correctly recovered most of the selected categories, leading to dramatic improvements over current methods for GO analysis. When used with microarray expression data and ChIP-chip data from yeast and human our method was able to correctly identify both general and specific enriched categories which were overlooked by other methods.
Chromatin Immunoprecipitation-on-chip Reveals Stress-dependent P53 Occupancy in Primary Normal Cells but Not in Established Cell Lines
Cancer Research. Dec, 2008 | Pubmed ID: 19047144
The p53 tumor suppressor protein is a transcription factor that plays a key role in the cellular response to stress and cancer prevention. Upon activation, p53 regulates a large variety of genes causing cell cycle arrest, apoptosis, or senescence. We have developed a p53-focused array, which allows us to investigate, simultaneously, p53 interactions with most of its known target sequences using the chromatin immunoprecipitation (ChIP)-on-chip methodology. Applying this technique to multiple cell types under various growth conditions revealed a profound difference in p53 activity between primary cells and established cell lines. We found that, in peripheral blood mononuclear cells, p53 exists in a form that binds only a small subset of its target regions. Upon exposure to genotoxic stress, the extent of targets bound by p53 significantly increased. By contrast, in established cell lines, p53 binds to essentially all of its targets irrespective of stress and cellular fate (apoptosis or arrest). Analysis of gene expression in these established lines revealed little correlation between DNA binding and the induction of gene expression. Our results suggest that nonactivated p53 has limited binding activity, whereas upon activation it binds to essentially all its targets. Additional triggers are most likely required to activate the transcriptional program of p53.
Nature Structural & Molecular Biology. May, 2009 | Pubmed ID: 19377480
CpG island-like sequences are commonly thought to provide the sole signals for designating constitutively unmethylated regions in the genome, thus generating open chromatin domains within a sea of global repression. Using a new database obtained from comprehensive microarray analysis, we show that unmethylated regions (UMRs) seem to be formed during early embryogenesis, not as a result of CpG-ness, but rather through the recognition of specific sequence motifs closely associated with transcription start sites. This same system probably brings about the resetting of pluripotency genes during somatic cell reprogramming. The data also reveal a new class of nonpromoter UMRs that become de novo methylated in a tissue-specific manner during development, and this process may be involved in gene regulation. In short, we show that UMRs are an important aspect of genome structure that have a dynamic role in development.
Molecular Systems Biology. 2009 | Pubmed ID: 19536199
The complementarity of gene expression and protein-DNA interaction data led to several successful models of biological systems. However, recent studies in multiple species raise doubts about the relationship between these two datasets. These studies show that the overwhelming majority of genes bound by a particular transcription factor (TF) are not affected when that factor is knocked out. Here, we show that this surprising result can be partially explained by considering the broader cellular context in which TFs operate. Factors whose functions are not backed up by redundant paralogs show a fourfold increase in the agreement between their bound targets and the expression levels of those targets. In addition, we show that incorporating protein interaction networks provides physical explanations for knockout effects. New double knockout experiments support our conclusions. Our results highlight the robustness provided by redundant TFs and indicate that in the context of diverse cellular systems, binding is still largely functional.
A Combined Expression-interaction Model for Inferring the Temporal Activity of Transcription Factors
Journal of Computational Biology : a Journal of Computational Molecular Cell Biology. Aug, 2009 | Pubmed ID: 19630541
Methods suggested for reconstructing regulatory networks can be divided into two sets based on how the activity level of transcription factors (TFs) is inferred. The first group of methods relies on the expression levels of TFs, assuming that the activity of a TF is highly correlated with its mRNA abundance. The second treats the activity level as unobserved and infers it from the expression of the genes that the TF regulates. While both types of methods were successfully applied, each suffers from drawbacks that limit their accuracy. For the first set, the assumption that mRNA levels are correlated with activity is violated for many TFs due to post-transcriptional modifications. For the second, the expression level of a TF which might be informative is completely ignored. Here we present the post-transcriptional modification model (PTMM) that, unlike previous methods, utilizes both sources of data concurrently. Our method uses a switching model to determine whether a TF is transcriptionally or post-transcriptionally regulated. This model is combined with a factorial HMM to reconstruct the interactions in a dynamic regulatory network. Using simulated and real data, we show that PTMM outperforms the other two approaches discussed above. Using real data, we also show that PTMM can recover meaningful TF activity levels and identify post-transcriptionally modified TFs, many of which are supported by other sources. Supporting website: www.sb.cs.cmu.edu/PTMM/PTMM.html.
Chromosome Research : an International Journal on the Molecular, Supramolecular and Evolutionary Aspects of Chromosome Biology. Jan, 2010 | Pubmed ID: 20205353
Microarray technology has facilitated the research of eukaryotic DNA replication on a genome-wide scale. Recent studies have shed light on the association between time of replication and chromosome structure, on the organization principles of the replication program, and on the correlation between replication timing and transcription. In this review, we summarize various genomic measurement approaches and the biological insights achieved through applying them in the study of the mammalian replication program.
Genome Research. Apr, 2010 | Pubmed ID: 20219943
Information about the binding preferences of many transcription factors is known and characterized by a sequence binding motif. However, determining regions of the genome in which a transcription factor binds based on its motif is a challenging problem, particularly in species with large genomes, since there are often many sequences that contain matches to the motif but are not bound. Several rules based on sequence conservation or location, relative to a transcription start site, have been proposed to help differentiate true binding sites from random ones. Other evidence sources may also be informative for this task. We developed a method for integrating multiple evidence sources using logistic regression classifiers. Our method works in two steps. First, we infer a score quantifying the general binding preferences of transcription factor binding at all locations based on a large set of evidence features, without using any motif-specific information. Then, we combine this general binding preference score with motif information for specific transcription factors to improve prediction of regions bound by the factor. Using cross-validation and new experimental data we show that, surprisingly, the general binding preference can be highly predictive of true locations of transcription factor binding even when no binding motif is used. When combined with motif information our method outperforms previous methods for predicting locations of true binding.
Cancer Cell. Mar, 2010 | Pubmed ID: 20227041
The p53 gene is mutated in many human tumors. Cells of such tumors often contain abundant mutant p53 (mutp53) protein, which may contribute actively to tumor progression via a gain-of-function mechanism. We applied ChIP-on-chip analysis and identified the vitamin D receptor (VDR) response element as overrepresented in promoter sequences bound by mutp53. We report that mutp53 can interact functionally and physically with VDR. Mutp53 is recruited to VDR-regulated genes and modulates their expression, augmenting the transactivation of some genes and relieving the repression of others. Furthermore, mutp53 increases the nuclear accumulation of VDR. Importantly, mutp53 converts vitamin D into an antiapoptotic agent. Thus, p53 status can determine the biological impact of vitamin D on tumor cells.
Comparative Analysis of DNA Replication Timing Reveals Conserved Large-scale Chromosomal Architecture
PLoS Genetics. Jul, 2010 | Pubmed ID: 20617169
Recent evidence suggests that the timing of DNA replication is coordinated across megabase-scale domains in metazoan genomes, yet the importance of this aspect of genome organization is unclear. Here we show that replication timing is remarkably conserved between human and mouse, uncovering large regions that may have been governed by similar replication dynamics since these species have diverged. This conservation is both tissue-specific and independent of the genomic G+C content conservation. Moreover, we show that time of replication is globally conserved despite numerous large-scale genome rearrangements. We systematically identify rearrangement fusion points and demonstrate that replication time can be locally diverged at these loci. Conversely, rearrangements are shown to be correlated with early replication and physical chromosomal proximity. These results suggest that large chromosomal domains of coordinated replication are shuffled by evolution while conserving the large-scale nuclear architecture of the genome.
Combination of Genomic Approaches with Functional Genetic Experiments Reveals Two Modes of Repression of Yeast Middle-phase Meiosis Genes
BMC Genomics. Aug, 2010 | Pubmed ID: 20716365
Regulation of meiosis and sporulation in Saccharomyces cerevisiae is a model for a highly regulated developmental process. Meiosis middle phase transcriptional regulation is governed by two transcription factors: the activator Ndt80 and the repressor Sum1. It has been suggested that the competition between Ndt80 and Sum1 determines the temporal expression of their targets during middle meiosis.
Wiley Interdisciplinary Reviews. Systems Biology and Medicine. May-Jun, 2010 | Pubmed ID: 20836034
Methylation of cytosines is the key epigenetic modification of DNA in eukaryotes and is associated with a repressed chromatin state and inhibition of gene expression. The methylation pattern in mammalian genomes is bimodal, with most of the genome methylated except for short DNA stretches called CpG islands (CGIs), which are generally protected from methylation. Recent technical advances have made it possible to map DNA methylation patterns on a large scale. Several genomic studies have made significant progress in unraveling the intricate relationships between DNA methylation, chromatin structure, and gene expression. What is emerging is a more dynamic and complex association between DNA methylation and expression than previously known. Here we highlight several recent genomic studies with an emphasis on what new information is gained from these studies and what conclusions can be reached about the role of DNA methylation in controlling gene expression.
Bioinformatics (Oxford, England). Sep, 2011 | Pubmed ID: 21752801
Motif discovery is now routinely used in high-throughput studies including large-scale sequencing and proteomics. These datasets present new challenges. The first is speed. Many motif discovery methods do not scale well to large datasets. Another issue is identifying discriminative rather than generative motifs. Such discriminative motifs are important for identifying co-factors and for explaining changes in behavior between different conditions.
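The distinction between discriminative and generative motifs can be illustrated with a minimal sketch. This is not the method of the paper above; it is a generic k-mer enrichment ranking that assumes a hypergeometric null model, and the function names and toy sequences are hypothetical.

```python
from math import comb


def hypergeom_tail(hits, total_seqs, seqs_with_kmer, n_positive):
    """P(X >= hits) under a hypergeometric null (k-mer unrelated to the labels)."""
    denom = comb(total_seqs, n_positive)
    upper = min(seqs_with_kmer, n_positive)
    numer = sum(
        comb(seqs_with_kmer, i) * comb(total_seqs - seqs_with_kmer, n_positive - i)
        for i in range(hits, upper + 1)
    )
    return numer / denom


def kmer_set(seq, k):
    """All distinct k-mers in a sequence (presence/absence, not counts)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def discriminative_kmers(positives, negatives, k=5):
    """Rank k-mers by enrichment in positives relative to negatives."""
    pos_hits, all_hits = {}, {}
    for seq in positives:
        for kmer in kmer_set(seq, k):
            pos_hits[kmer] = pos_hits.get(kmer, 0) + 1
            all_hits[kmer] = all_hits.get(kmer, 0) + 1
    for seq in negatives:
        for kmer in kmer_set(seq, k):
            all_hits[kmer] = all_hits.get(kmer, 0) + 1
    total = len(positives) + len(negatives)
    ranked = [
        (hypergeom_tail(pos_hits.get(kmer, 0), total, n_all, len(positives)), kmer)
        for kmer, n_all in all_hits.items()
    ]
    return sorted(ranked)  # smallest p-value first


# Toy data (invented for illustration only).
positives = ["ACGTGACC", "TTGTGACA", "GTGACGGA", "CCGTGACT"]
negatives = ["ACCATTGA", "TTAACCGG", "GGATCCAA", "CATTAGGC"]
ranked = discriminative_kmers(positives, negatives, k=5)
```

A k-mer that is frequent in both sets would score poorly here even though a generative model of the positive set alone would favor it, which is the point of the discriminative framing. The exhaustive enumeration is only practical for small inputs.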
Biophysical Journal. Apr, 2012 | Pubmed ID: 22768926
Two major classes of small regulatory RNAs--small interfering RNAs (siRNAs) and microRNAs (miRNAs)--are involved in a common RNA interference processing pathway. Small RNAs within each of these families were found to compete for limiting amounts of shared components required for their biogenesis and processing. Association with Argonaute (Ago), the catalytic component of the RNA silencing complex, was suggested as the central mechanistic point in RNA interference machinery competition. Aiming to better understand the competition between small RNAs in the cell, we present a mathematical model and characterize a range of specific cell and experimental parameters affecting the competition. We apply the model to competition between miRNAs and study the change in the expression level of their target genes under a variety of conditions. We show quantitatively that the amount of Ago and miRNAs in the cell are dominant factors contributing greatly to the competition. Interestingly, we observe what to our knowledge is a novel type of competition that takes place when Ago is abundant, by which miRNAs with shared targets compete over them. Furthermore, we use the model to examine different interaction mechanisms that might operate in establishing the miRNA-Ago complexes, mainly those related to their stability and recycling. Our model provides a mathematical framework for future studies of competition effects in regulation mechanisms involving small RNAs.
Nature Reviews. Genetics. Jul, 2012 | Pubmed ID: 22805708
Biological processes are often dynamic, thus researchers must monitor their activity at multiple time points. The most abundant source of information regarding such dynamic activity is time-series gene expression data. These data are used to identify the complete set of activated genes in a biological process, to infer their rates of change, their order and their causal effects and to model dynamic systems in the cell. In this Review we discuss the basic patterns that have been observed in time-series experiments, how these patterns are combined to form expression programs, and the computational analysis, visualization and integration of these data to infer models of dynamic biological systems.
Genome-wide Analysis of Androgen Receptor Targets Reveals COUP-TF1 As a Novel Player in Human Prostate Cancer
PloS One. 2012 | Pubmed ID: 23056316
Androgen activity plays a key role in prostate cancer progression. Androgen receptor (AR) is the main mediator of androgen activity in the prostate, through its ability to act as a transcription mediator. Here we performed a genome-wide analysis of human AR binding to promoters in the presence of an agonist or antagonist in an androgen dependent prostate cancer cell line. Many of the AR bound promoters are bound in all examined conditions while others are bound only in the presence of an agonist or antagonist. Several motifs are enriched in AR bound promoters, including the AR Response Element (ARE) half-site and recognition elements for the transcription factors OCT1 and SOX9. This suggests that these 3 factors could define a module of co-operating transcription factors in the prostate. Interestingly, AR bound promoters are preferentially located in AT rich genomic regions. Analysis of mRNA expression identified chicken ovalbumin upstream promoter-transcription factor 1 (COUP-TF1) as a direct AR target gene that is downregulated upon binding by the agonist liganded AR. COUP-TF1 immunostaining revealed nucleolar localization of COUP-TF1 in epithelium of human androgen dependent prostate cancer, but not in adjacent benign prostate epithelium. Stromal cells both in human and mouse prostate show nuclear COUP-TF1 staining. We further show that there is an inverse correlation between COUP-TF1 expression in prostate stromal cells and the rising levels of androgen with advancing puberty. This study extends the pool of recognized putative AR targets and identifies a negatively regulated target of AR - COUP-TF1 - which could possibly play a role in human prostate cancer.
Systematic Determination of Replication Activity Type Highlights Interconnections Between Replication, Chromatin Structure and Nuclear Localization
PloS One. 2012 | Pubmed ID: 23145042
DNA replication is a highly regulated process, with each genomic locus replicating at a distinct time of replication (ToR). Advances in ToR measurement technology enabled several genome-wide profiling studies that revealed tight associations between ToR and general genomic features and a remarkable ToR conservation in mammals. Genome-wide studies further showed that at the scale of hundreds of kilobases to megabases the genome can be divided into constant ToR regions (CTRs) in which the replication process propagates at a faster pace due to the activation of multiple origins and temporal transition regions (TTRs) in which the replication process propagates at a slower pace. We developed a computational tool that assigns a ToR to every measured locus and determines its replication activity type (CTR versus TTR). Our algorithm, ARTO (Analysis of Replication Timing and Organization), uses signal processing methods to fit a constant piece-wise linear curve to the measured raw data. We tested our algorithm and provide performance and usability results. A Matlab implementation of ARTO is available at http://bioinfo.cs.technion.ac.il/people/zohar/ARTO/. Applying our algorithm to ToR data measured in multiple mouse and human samples allowed precise genome-wide ToR determination and replication activity type characterization. Analysis of the results highlighted the plasticity of the replication program. For example, we observed significant ToR differences in 10-25% of the genome when comparing different tissue types. Our analyses also provide evidence for activity type differences in up to 30% of the probes. Integration of the ToR data with multiple aspects of chromosome organization characteristics suggests that ToR plays a role in shaping the regional chromatin structure. Namely, repressive chromatin marks are associated with late ToR both in TTRs and CTRs. Finally, characterization of the differences between TTRs and CTRs, with matching ToR, revealed that TTRs are associated with compact chromatin and are located significantly closer to the nuclear envelope. Supplementary material is available. Raw and processed data were deposited in GEO (GSE17236).
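The CTR/TTR distinction can be illustrated with a minimal sketch. This is not ARTO and does not reproduce its curve fitting; it assumes a simple moving-average smoother followed by a slope threshold, and the function name, parameters, units, and toy profile are all hypothetical.

```python
def classify_replication_activity(positions, tor, window=3, slope_threshold=0.15):
    """Label each locus 'CTR' (flat ToR) or 'TTR' (changing ToR).

    positions: genomic coordinates, strictly increasing (assumed here in Mb)
    tor:       measured time of replication per locus (arbitrary units)
    """
    n = len(tor)
    half = window // 2

    # Step 1: moving-average smoothing to suppress probe-level noise.
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(sum(tor[lo:hi]) / (hi - lo))

    # Step 2: central-difference slope; steep regions are called transitions.
    labels = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        slope = (smoothed[hi] - smoothed[lo]) / (positions[hi] - positions[lo])
        labels.append("TTR" if abs(slope) > slope_threshold else "CTR")
    return smoothed, labels


# Toy profile (invented): early plateau, a ramp, then a late plateau.
positions = [float(i) for i in range(30)]
tor = [1.0] * 10 + [1.0 + 0.3 * k for k in range(1, 11)] + [4.0] * 10
smoothed, labels = classify_replication_activity(positions, tor)
```

On real data the threshold and window would need to be chosen against the noise level of the measurement, and a proper segmentation would fit the breakpoints jointly rather than locus by locus.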
Integrative Analysis of Methylome and Transcriptome Reveals the Importance of Unmethylated CpGs in Non-CpG Island Gene Activation
BioMed Research International. 2013 | Pubmed ID: 23936848
Promoter methylation is associated with gene repression; however, little is known about its mechanism. It was proposed that the repression of methylated genes is achieved through the recruitment of methyl binding proteins (MBPs) that participate in closing the chromatin. An alternative mechanism suggests that methylation interferes with the binding of either site specific activators or more general activators that bind to the CpG dinucleotide. However, the relative contribution of these two mechanisms to gene repression is not known.
Nature. Mar, 2015 | Pubmed ID: 25762143
Stochastic processes in cells are associated with fluctuations in mRNA, protein production and degradation, noisy partition of cellular components at division, and other cell processes. Variability within a clonal population of cells originates from such stochastic processes, which may be amplified or reduced by deterministic factors. Cell-to-cell variability, such as that seen in the heterogeneous response of bacteria to antibiotics, or of cancer cells to treatment, is understood as the inevitable consequence of stochasticity. Variability in cell-cycle duration was observed long ago; however, its sources are still unknown. A central question is whether the variance of the observed distribution originates from stochastic processes, or whether it arises mostly from a deterministic process that only appears to be random. A surprising feature of cell-cycle-duration inheritance is that it seems to be lost within one generation but to be still present in the next generation, generating poor correlation between mother and daughter cells but high correlation between cousin cells. This observation suggests the existence of underlying deterministic factors that determine the main part of cell-to-cell variability. We developed an experimental system that precisely measures the cell-cycle duration of thousands of mammalian cells along several generations and a mathematical framework that allows discrimination between stochastic and deterministic processes in lineages of cells. We show that the inter- and intra-generation correlations reveal complex inheritance of the cell-cycle duration. Finally, we build a deterministic nonlinear toy model for cell-cycle inheritance that reproduces the main features of our data. Our approach constitutes a general method to identify deterministic variability in lineages of cells or organisms, which may help to predict and, eventually, reduce cell-to-cell heterogeneity in various systems, such as cancer cells under treatment.
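The comparison of mother-daughter and cousin correlations described above can be sketched as follows. This is only an illustration of how such pair correlations are computed from a lineage tree, not the authors' framework; the lineage encoding, function names, and durations are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mother_daughter_pairs(lineage):
    """(mother duration, daughter duration) for every cell with a known mother."""
    return [
        (lineage[mother][1], duration)
        for mother, duration in lineage.values()
        if mother is not None
    ]


def cousin_pairs(lineage):
    """Duration pairs for cells sharing a grandmother but not a mother.

    Both orderings of each pair are kept so the result does not depend
    on how cells happen to be labeled.
    """
    def grandmother(cell):
        mother = lineage[cell][0]
        return None if mother is None else lineage[mother][0]

    pairs = []
    for a in lineage:
        for b in lineage:
            if a == b:
                continue
            ga, gb = grandmother(a), grandmother(b)
            if ga is not None and ga == gb and lineage[a][0] != lineage[b][0]:
                pairs.append((lineage[a][1], lineage[b][1]))
    return pairs


# Toy lineage (invented): cell id -> (mother id or None, cycle duration in hours).
lineage = {
    0: (None, 20.0),
    1: (0, 18.0), 2: (0, 23.0),
    3: (1, 21.0), 4: (1, 19.5), 5: (2, 21.5), 6: (2, 19.0),
}
md = mother_daughter_pairs(lineage)
cousins = cousin_pairs(lineage)
r_md = pearson([m for m, _ in md], [d for _, d in md])
r_cousin = pearson([a for a, _ in cousins], [b for _, b in cousins])
```

With thousands of tracked cells, a cousin correlation that is high while the mother-daughter correlation is low is the pattern the abstract describes; the seven-cell toy tree here is far too small to show it and only demonstrates the bookkeeping.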
The P53 C Terminus Controls Site-specific DNA Binding and Promotes Structural Changes Within the Central DNA Binding Domain
Molecular Cell. Mar, 2015 | Pubmed ID: 25794615
DNA binding by numerous transcription factors including the p53 tumor suppressor protein constitutes a vital early step in transcriptional activation. While the role of the central core DNA binding domain (DBD) of p53 in site-specific DNA binding has been established, the contribution of the sequence-independent C-terminal domain (CTD) is still not well understood. We investigated the DNA-binding properties of a series of p53 CTD variants using a combination of in vitro biochemical analyses and in vivo binding experiments. Our results provide several unanticipated and interconnected findings. First, the CTD enables DNA binding in a sequence-dependent manner that is drastically altered by either its modification or deletion. Second, dependence on the CTD correlates with the extent to which the p53 binding site deviates from the canonical consensus sequence. Third, the CTD enables stable formation of p53-DNA complexes to divergent binding sites via DNA-induced conformational changes within the DBD itself.
BioEssays : News and Reviews in Molecular, Cellular and Developmental Biology. Jan, 2016 | Pubmed ID: 26628302
We describe a recent approach for distinguishing between stochastic and deterministic sources of variability, focusing on the mammalian cell cycle. Variability between cells is often attributed to stochastic noise, although it may be generated by deterministic components. Interestingly, lineage information can be used to distinguish between variability and determinism. Analysis of correlations within a lineage of the mammalian cell cycle duration revealed its deterministic nature. Here, we discuss the sources of such variability and the possibility that the underlying deterministic process is due to the circadian clock. Finally, we discuss the "kicked cell cycle" model and its implication on the study of the cell cycle in healthy and cancerous tissues.
Nucleic Acids Research. May, 2016 | Pubmed ID: 27085808
Genome sequence compositions and epigenetic organizations are correlated extensively across multiple length scales. Replication dynamics, in particular, is highly correlated with GC content. We combine genome-wide time of replication (ToR) data, topological domain maps and detailed functional epigenetic annotations to study the correlations between replication timing and GC content at multiple scales. We find that the decrease in genomic GC content at large-scale, late-replicating regions can be explained by mutation bias favoring A/T nucleotides, without selection or biased gene conversion. Quantification of the free dNTP pool during the cell cycle is consistent with a mechanism involving a replication-coupled mutation spectrum that favors AT nucleotides at late S-phase. We suggest that mammalian GC content composition is shaped by independent forces, globally modulating mutation bias and locally selecting on functional elements. Deconvoluting these forces and analyzing them on their native scales is important for proper characterization of complex genomic correlations.
Cell Cycle (Georgetown, Tex.). Dec, 2016 | Pubmed ID: 27801609
The heterogeneous response of clonal cancer cells to treatment is understood to be caused by several factors, including stochasticity, cell-cycle dynamics, and different micro-environments. In a tumor, cancer cells may encounter fluctuating conditions and transit from a stationary culture to a proliferating state, for example following treatment. Here, we undertake a quantitative evaluation of the response of single cancerous lymphoblasts (L1210 cells) to various treatments administered during this transition. Additionally, we developed an experimental system, a "Mammalian Mother Machine," that tracks the fate of thousands of mammalian cells over several generations under transient exposure to chemotherapeutic drugs. Using this system, we were able to follow the same cell under repeated treatments and continuously track many generations. We found that the dynamics of the transition between stationary and proliferative states are highly variable and affect the response to drug treatment. Using cell-cycle markers, we were able to isolate a subpopulation of persister cells with distinctly higher than average survival probability. The higher survival rate encountered with cell-cycle-phase-specific drugs was associated with a significantly longer time-till-division, and was reduced by a non-cell-cycle-specific drug. Our results suggest that the variability of transition times from the stationary to the proliferating state may be an obstacle hampering the effectiveness of drugs and should be taken into account when designing treatment regimens.