The morphological symmetry of the division process of Escherichia coli is well-known. Recent studies verified that, in optimal growth conditions, most divisions are symmetric, although there are exceptions. We investigate whether such morphological asymmetries in division introduce functional asymmetries between sister cells, and assess the robustness of the symmetry in division to mild chemical stresses and sub-optimal temperatures. First, we show that the difference in size between daughter cells at birth is positively correlated with the difference between the numbers of fluorescent protein complexes inherited from the parent cell. Next, we show that the degree of symmetry in division observed in optimal conditions is robust to mild acidic shift and to mild oxidative stress, but not to sub-optimal temperatures, in that the variance of the difference between the sizes of sister cells at birth is minimized at 37 °C. This increased variance affects the functionality of the cells in that, at sub-optimal temperatures, larger/smaller cells arising from asymmetric divisions exhibit faster/slower division times than the mean population division time, respectively. On the other hand, cells dividing faster do not do so at the cost of morphological symmetry in division. Finally, we show that at sub-optimal temperatures the mean distance between the nucleoids increases, explaining the increased variance in division. We conclude that the functionality of E. coli cells is not immune to morphological asymmetries at birth, and that the effectiveness of the mechanism responsible for ensuring the symmetry in division weakens at sub-optimal temperatures.
The emergence and spread of multidrug-resistant microorganisms have greatly increased the need for therapeutic trials, necessitating a thorough exploration of novel antimicrobial strategies. This study screens and analyzes a novel set of azoderivatives of β-diketones and their known analogues for antimicrobial properties. The compounds were analyzed to determine their minimum inhibitory concentration. The hit compounds 5-(2-(2-hydroxyphenyl)hydrazono)pyrimidine-2,4,6(1H,3H,5H)-trione (C5), 5-chloro-3-(2-(4,4-dimethyl-2,6-dioxocyclohexylidene)hydrazinyl)-2-hydroxybenzenesulfonic acid (C8), and 2-(2-carboxyphenylhydrazo)malononitrile (C11) were then evaluated for their effects on transcription, translation, and cellular oxidation. All three compounds were found to inhibit E. coli cell growth in vitro. The study also revealed that these compounds have a notable impact on cellular activities. The newly synthesized azoderivative of barbituric acid (C8) was found to have the highest growth inhibitory activity among the three compounds considered, characterized by a MIC50 of 0.42 mg/mL. The MS2 reporter system was used to detect the transcriptional response of the bacteria to treatment with the selected drugs. All three compounds were found to down-regulate the transcriptional pathway. The novel compound, C5, showed the strongest inhibition of transcription, followed by C8 and C11. The effect of the compounds on translation was analyzed using a yellow fluorescent protein reporter system. All the compounds reduced translation, of which C8 was found to be the most effective, exhibiting 8.5-fold repression, followed by C5 and C11, respectively. Fluctuations in the concentration of reactive oxygen species (ROS) upon incubation with the hit compounds were investigated using a ROS sensor protein. All three compounds were found to contribute to the oxidative pathway. C8 was found to have a stronger oxidative effect than C5 and C11. All experiments were repeated at least twice, and the results were verified to be statistically significant.
Microorganisms often form multicellular structures such as biofilms and structured colonies that can influence the organism's virulence, drug resistance, and adherence to medical devices. Phenotypic classification of these structures has traditionally relied on qualitative scoring systems that limit detailed phenotypic comparisons between strains. Automated imaging and quantitative analysis have the potential to improve the speed and accuracy of experiments designed to study the genetic and molecular networks underlying different morphological traits. For this reason, we have developed a platform that uses automated image analysis and pattern recognition to quantify phenotypic signatures of yeast colonies. Our strategy enables quantitative analysis of individual colonies, measured at a single time point or over a series of time-lapse images, as well as the classification of distinct colony shapes based on image-derived features. Phenotypic changes in colony morphology can be expressed as changes in feature space trajectories over time, thereby enabling the visualization and quantitative analysis of morphological development. To facilitate data exploration, results are plotted dynamically through an interactive Yeast Image Analysis web application (YIMAA; http://yimaa.cs.tut.fi) that integrates the raw and processed images across all time points, allowing exploration of the image-based features and principal components associated with morphological development.
We explore whether the process of multimerization can be used as a means to regulate noise in the abundance of functional protein complexes. Additionally, we analyze how this process affects the mean level of these functional units, the response time of a gene, and the temporal correlation between the numbers of expressed proteins and of the functional multimers. We show that, although multimerization increases noise by reducing the mean number of functional complexes, it can reduce noise in comparison with a monomer when the abundances of the functional proteins are comparable. Alternatively, reduction in noise occurs if both monomeric and multimeric forms of the protein are functional. Moreover, we find that multimerization either increases the response time to external signals or decreases the correlation between the number of functional complexes and protein production kinetics. Finally, we show that the results are in agreement with recent genome-wide assessments of cell-to-cell variability in protein numbers and of multimerization in essential and non-essential genes in Escherichia coli, and that the effects of multimerization are tangible at the level of genetic circuits.
Cancer is a broad group of genetic diseases which account for millions of deaths worldwide each year. Cancers are classified by various clinical, pathological and molecular methods, but even within a well-characterized disease, there is significant inter-patient variability in survival, response to treatment, and other parameters. Especially at the molecular level, tumours of the same category can appear significantly dissimilar due to complex combinations of genetic aberrations leading to a similar malignancy. We extended the current classification methods by studying tumour heterogeneity at the pathway level.
Cell imaging is becoming an indispensable tool for cell and molecular biology research. However, most processes studied are stochastic in nature, and require the observation of many cells and events. Ideally, extraction of information from these images ought to rely on automatic methods. Here, we propose a novel segmentation method, MAMLE, for detecting cells within dense clusters.
High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to assist in unraveling the full potential of such studies. Image segmentation is typically at the forefront of such analysis as the performance of the subsequent steps, for example, cell classification, cell tracking etc., often relies on the results of segmentation.
Zebrafish embryos have recently been established as a xenotransplantation model of the metastatic behaviour of primary human tumours. Current tools for automated data extraction from the microscope images are restrictive concerning the developmental stage of the embryos, usually require laborious manual image preprocessing, and, in general, cannot characterize the metastasis as a function of the internal organs.
Using a single-RNA detection technique in live Escherichia coli cells, we measure, for each cell, the waiting time for the production of the first RNA under the control of the PBAD promoter after induction by arabinose, and the subsequent intervals between transcription events. We find that the kinetics of the arabinose intake system affect the mean and diversity in RNA numbers long after induction. We observed the same effect on the Plac/ara-1 promoter, which is inducible by arabinose or by IPTG. Importantly, the distribution of waiting times of Plac/ara-1 is indistinguishable from that of PBAD if and only if it is induced by arabinose alone. Finally, RNA production under the control of PBAD is found to be a sub-Poissonian process. We conclude that inducer-dependent waiting times affect the mean and cell-to-cell diversity in RNA numbers long after induction, suggesting that intake mechanisms have non-negligible effects on the phenotypic diversity of cell populations in natural, fluctuating environments.
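To illustrate what "sub-Poissonian" means for the intervals between transcription events, the following is a minimal Python sketch, not the analysis used in the study. For a Poisson process the intervals are exponentially distributed, so their squared coefficient of variation (CV²) equals 1; a CV² below 1 indicates a more regular, sub-Poissonian process. The function name `interval_cv2` and the interval values are hypothetical and chosen only for illustration.

```python
import statistics

def interval_cv2(intervals):
    """Squared coefficient of variation (variance / mean^2) of waiting times."""
    mean = statistics.fmean(intervals)
    var = statistics.pvariance(intervals, mu=mean)
    return var / mean ** 2

# Hypothetical intervals (in seconds) between consecutive RNA productions
# in one cell; these are NOT measurements from the study.
example_intervals = [600.0, 750.0, 820.0, 900.0, 1010.0, 680.0]

cv2 = interval_cv2(example_intervals)
# CV^2 < 1 suggests sub-Poissonian kinetics; CV^2 = 1 is the Poisson case.
is_sub_poissonian = cv2 < 1.0
```

With few intervals the estimate of CV² is itself noisy, so in practice such a comparison would need many cells and an appropriate statistical test.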
Altered expression of oncogenic and tumour-suppressing microRNAs (miRNAs) is widely associated with tumourigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumours. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and examined expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression.
Fusion genes are chromosomal aberrations that are found in many cancers and can be used as prognostic markers and drug targets in clinical practice. Fusions can lead to production of oncogenic fusion proteins or to enhanced expression of oncogenes. Several recent studies have reported that some fusion genes can escape microRNA regulation via 3′-untranslated region (3′-UTR) deletion. We performed whole transcriptome sequencing to identify fusion genes in glioma and discovered FGFR3-TACC3 fusions in 4 of 48 glioblastoma samples from patients both of mixed European and of Asian descent, but not in any of 43 low-grade glioma samples tested. The fusion, caused by tandem duplication on 4p16.3, led to the loss of the 3′-UTR of FGFR3, blocking gene regulation by miR-99a and enhancing expression of the fusion gene. The fusion gene was mutually exclusive with EGFR, PDGFR, or MET amplification. Using cultured glioblastoma cells and a mouse xenograft model, we found that fusion protein expression promoted cell proliferation and tumor progression, while WT FGFR3 protein was not tumorigenic, even under forced overexpression. These results demonstrated that the FGFR3-TACC3 gene fusion is expressed in human cancer and generates an oncogenic protein that promotes tumorigenesis in glioblastoma.
The behavior of genetic motifs is determined not only by the gene-gene interactions, but also by the expression patterns of the constituent genes. Live single-molecule measurements have provided evidence that transcription initiation is a sequential process, whose kinetics plays a key role in the dynamics of mRNA and protein numbers. The extent to which it affects the behavior of cellular motifs is unknown. Here, we examine how the kinetics of transcription initiation affects the behavior of motifs performing filtering in amplitude and frequency domain. We find that the performance of each filter is degraded as transcript levels are lowered. This effect can be reduced by having a transcription process with more steps. In addition, we show that the kinetics of the stepwise transcription initiation process affects features such as filter cutoffs. These results constitute an assessment of the range of behaviors of genetic motifs as a function of the kinetics of transcription initiation, and thus will aid in tuning of synthetic motifs to attain specific characteristics without affecting their protein products.
The potential impact of nanoparticles on the environment and on human health has attracted considerable interest worldwide. The amount of transcriptomics data from tissues and cell lines exposed to nanoparticles increases year by year. In addition to the importance of the original findings, these data can have value in a broader context when combined with other previously acquired and published results. In order to facilitate the efficient usage of the data, we have developed the NanoMiner web resource (http://nanominer.cs.tut.fi/), which contains 404 human transcriptome samples exposed to various types of nanoparticles. All the samples in NanoMiner have been annotated, preprocessed and normalized using standard methods that ensure the quality of the data analyses and enable the users to utilize the database systematically across the different experimental setups and platforms. With NanoMiner it is possible to 1) search and plot the expression profiles of one or several genes of interest, 2) cluster the samples within the datasets, 3) find differentially expressed genes in various nanoparticle studies, 4) detect the nanoparticles causing differential expression of selected genes, 5) analyze enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Gene Ontology (GO) terms for the detected genes and 6) search the expression values and differential expressions of the genes belonging to a specific KEGG pathway or Gene Ontology. In sum, the NanoMiner database is a valuable collection of microarray data which can also be used as a data repository for future analyses.
We explore the effects of probabilistic RNA partitioning during cell division on the normalized variance of RNA numbers across generations of bacterial populations. We first characterize these effects in model cell populations, where gene expression is modeled as a delayed stochastic process, as a function of the synchrony in cell division, the rate of division, and the RNA degradation rate. We further explore the additional variance that arises if the partitioning is biased. Next, in Escherichia coli cells expressing RNA tagged with MS2d-GFP, we measured the normalized variance of RNA numbers across several generations, with cell divisions synchronized by heat shock. We show that synchronized cell populations exhibit transient increases in normalized variance following cell divisions, as predicted by the model, which are not observed in unsynchronized populations. We conclude that errors in partitioning of RNA molecules generate diversity between the offspring of individual bacteria and thus constitute a form of reproductive bet-hedging.
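The effect of unbiased probabilistic partitioning on the normalized variance can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the delayed stochastic model of the study: each RNA molecule is assigned independently to one of the two daughters with probability `p` (binomial partitioning), there is no production or degradation, and all mother cells hold the same hypothetical number of RNAs. The names `partition_rna` and `normalized_variance` are introduced here for illustration only.

```python
import random
import statistics

def partition_rna(n_rna, p=0.5, rng=random):
    """Split n_rna molecules between two daughters.

    Each molecule goes to the first daughter with probability p
    (p = 0.5 is unbiased partitioning; p != 0.5 is biased).
    """
    to_first = sum(1 for _ in range(n_rna) if rng.random() < p)
    return to_first, n_rna - to_first

def normalized_variance(counts):
    """Variance over squared mean (CV^2) of RNA numbers across cells."""
    mean = statistics.fmean(counts)
    return statistics.pvariance(counts, mu=mean) / mean ** 2

rng = random.Random(0)          # fixed seed so the sketch is reproducible
mothers = [20] * 2000           # hypothetical: every mother holds 20 RNAs
daughters = []
for n in mothers:
    first, second = partition_rna(n, 0.5, rng)
    daughters.extend([first, second])

nv_before = normalized_variance(mothers)    # identical mothers: no variance
nv_after = normalized_variance(daughters)   # partitioning errors add variance
```

Even though all mothers are identical in this sketch, the daughters differ in RNA numbers, which is the kind of division-induced increase in normalized variance the abstract describes for synchronized populations.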
Production and degradation of RNA and proteins are stochastic processes, making it difficult to distinguish spurious fluctuations in their numbers from changes in the dynamics of a genetic circuit. An accurate method of change detection is key to analyzing the plasticity and robustness of stochastic genetic circuits.
In vitro studies show that the transcriptional dynamics in Escherichia coli is sensitive to Mg(2+) concentration in the cell. We study in vivo how Mg(2+) affects the production of RNA molecules under the control of the lar promoter, P(lar), a lac promoter variant. The target RNA codes for RFP followed by 96 MS2d-GFP binding sites, allowing in vivo detection of individual RNA molecules following transcription. As Mg(2+) concentration is increased, transcript production first increases, but then decreases. Results were confirmed by qPCR and gel assay. Analysis of cell-to-cell diversity in RNA production shows that the variance of RNA numbers changes with Mg(2+). Gel assay confirms changes in the structure of the target RNA. These results suggest that changes in the dynamics of elongation may also affect RNA production, along with changes in the dynamics of the promoter open complex. The findings suggest that changes in metabolite concentration can have multiple, complex effects on the in vivo dynamics of transcription. Comparative analysis of the effects of other metabolites on the dynamics of transcription confirms the significance of the effects of Mg(2+) ions. Namely, we show that Ca(2+) and Fe(2+) have almost negligible effects in comparison to Mg(2+).
In Escherichia coli the mean and cell-to-cell diversity in RNA numbers of different genes vary widely. This is likely due to different kinetics of transcription initiation, a complex process with multiple rate-limiting steps that affect RNA production.
Patterns of genome-wide methylation vary between tissue types. For example, cancer tissue shows markedly different patterns from those of normal tissue. In this paper we propose a beta-mixture model to describe genome-wide methylation patterns based on probe data from methylation microarrays. The model takes dependencies between neighbour probe pairs into account and assumes three broad categories of methylation, low, medium and high. The model is described by 37 parameters, which reduces the dimensionality of a typical methylation microarray significantly. We used methylation microarray data from 42 colon cancer samples to assess the model.
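As a rough illustration of the beta-mixture idea, the following Python sketch evaluates the density of a three-component mixture of beta distributions over methylation levels in (0, 1), one component each for low, medium and high methylation. It is a simplification under stated assumptions: it ignores the dependencies between neighbouring probes that the proposed model includes, and the weights and shape parameters are hypothetical values chosen for illustration, not the 37 fitted parameters of the model.

```python
import math

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def mixture_pdf(x, weights, params):
    """Density of a beta mixture: sum_k w_k * Beta(x; a_k, b_k)."""
    return sum(w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params))

# Hypothetical mixture: low, medium and high methylation components.
weights = [0.5, 0.2, 0.3]                          # must sum to 1
params = [(2.0, 10.0), (5.0, 5.0), (10.0, 2.0)]    # (a, b) per component

density_low = mixture_pdf(0.1, weights, params)
density_high = mixture_pdf(0.9, weights, params)
```

In an actual analysis the weights and shape parameters would be estimated from the probe data, for example by maximum likelihood, rather than fixed by hand.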
Neuronal networks exhibit a wide diversity of structures, which contributes to the diversity of the dynamics therein. The presented work applies an information theoretic framework to simultaneously analyze structure and dynamics in neuronal networks. Information diversity within the structure and dynamics of a neuronal network is studied using the normalized compression distance. To describe the structure, a scheme for generating distance-dependent networks with identical in-degree distribution but variable strength of dependence on distance is presented. The resulting network structure classes possess differing path length and clustering coefficient distributions. In parallel, comparable realistic neuronal networks are generated with the NETMORPH simulator and a similar analysis is performed on them. To describe the dynamics, network spike trains are simulated using different network structures and their bursting behaviors are analyzed. For the simulation of the network activity, the Izhikevich model of spiking neurons is used together with the Tsodyks model of dynamical synapses. We show that the structure of the simulated neuronal networks affects the spontaneous bursting activity when measured with bursting frequency and a set of intraburst measures: the more locally connected networks produce more and longer bursts than the more random networks. The information diversity of the structure of a network is greatest in the most locally connected networks, smallest in random networks, and somewhere in between in the networks between order and disorder. As for the dynamics, the most locally connected networks and some of the in-between networks produce the most complex intraburst spike trains. The same result also holds for the sparser of the two considered network densities in the case of full spike trains.
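The normalized compression distance (NCD) can be approximated with any general-purpose compressor. The Python sketch below uses zlib and the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C denotes compressed length. It is an illustration only: the choice of zlib and the encoding of networks or spike trains as byte strings of '0' and '1' characters are assumptions made here, and the example sequences are hypothetical.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (compression level 9)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Hypothetical binary sequences, e.g. a flattened connectivity matrix or a
# discretized spike train, written as '0'/'1' characters.
regular = b"0100" * 500
rng = random.Random(1)
irregular = bytes(rng.choice(b"01") for _ in range(2000))

d_same = ncd(regular, regular)        # a sequence compared with itself
d_diff = ncd(regular, irregular)      # regular versus irregular sequence
```

Values near 0 indicate that one sequence is well described by the other, while values near 1 indicate that the sequences share little compressible structure.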
Gene expression in Escherichia coli is regulated by several mechanisms. We measured in single cells the expression level of a single-copy gene coding for green fluorescent protein (GFP), integrated into the genome and driven by a tetracycline-inducible promoter, for varying induction strengths. Also, we measured the transcriptional activity of a tetracycline-inducible promoter controlling the transcription of an RNA with 96 binding sites for MS2-GFP.
In prokaryotes, transcription and translation are dynamically coupled, as the latter starts before the former is complete. Also, from one transcript, several translation events occur in parallel. To study how events in transcription elongation affect translation elongation and fluctuations in protein levels, we propose a delayed stochastic model of prokaryotic transcription and translation at the nucleotide and codon level that includes the promoter open complex formation and alternative pathways to elongation, namely pausing, arrests, editing, pyrophosphorolysis, RNA polymerase traffic, and premature termination. Stepwise translation can start after the ribosome binding site is formed and accounts for variable codon translation rates, ribosome traffic, back-translocation, drop-off, and trans-translation.
We study the dynamics of a model stochastic two-gene switch at the nucleotide and codon levels. First, we show that its stability, the mean lifetime of the noisy attractors, differs from that of a model where transcription and translation elongation are modeled as single-step delayed events, indicating the need for detailed models to study the dynamics of switches. Next, we vary the coupling between the two genes by varying the affinity of repressor proteins to the promoters and measure the mutual information between the time series of the two proteins. We find that there is a degree of coupling that maximizes information propagation between the two genes. This is explained by the effects of the coupling on the mean and entropy of RNA and protein numbers of each gene, as well as the correlation and 2-tuple entropy between the numbers of the two proteins and, finally, the stability of the noisy attractors. We also find that increasing the rate of translation initiation increases the correlation between RNA and protein numbers and between the two proteins, due to increased stability of the noisy attractors. Increasing the rate of transcription or decreasing RNA degradation has opposite effects on the correlation between RNA and proteins of each gene and on the stability of the noisy attractors. Finally, we add a sequence-dependent transcription pause site and show that both its probability of occurrence and its mean duration affect the dynamics of the switch, further demonstrating the dependence of the dynamics of this circuit on sequence-level events.
In this editorial we introduce the research paradigms of signal processing in the era of systems biology. Signal processing is a field of science traditionally focused on modeling electronic and communications systems, but recently it has turned to biological applications with astounding results. The essence of signal processing is to describe the natural world by mathematical models and then, based on these models, develop efficient computational tools for solving engineering problems. Here, we underline, with examples, the endless possibilities which arise when the battle-hardened tools of engineering are applied to solve the problems that have tormented cancer researchers. Based on this approach, a new field has emerged, called cancer systems biology. Despite its short history, cancer systems biology has already produced several success stories tackling previously impracticable problems. Perhaps most importantly, it has been accepted as an integral part of the major endeavors of cancer research, such as analyzing the genomic and epigenomic data produced by The Cancer Genome Atlas (TCGA) project. Finally, we show that signal processing and cancer research, two fields that are seemingly distant from each other, have merged into a field that is indeed more than the sum of its parts.
Gene regulatory networks (GRNs) are parallel information processing systems, binding past events to future actions. Since cell types stably remain in restricted subsets of the possible states of the GRN, they are likely the dynamical attractors of the GRN. These attractors differ in which genes are active and in the amount of information propagating within the network. Using mutual information (I) as a measure of information propagation between genes in a GRN, modeled as finite-sized Random Boolean Networks (RBN), we study how the dynamical regime of the GRN affects I within attractors (I(A)). The spectra of I(A) of individual RBNs are found to be scattered and diverse, and distributions of I(A) of ensembles are non-trivial and change shape with mean connectivity. Mean and diversity of I(A) values are maximized in the chaotic near-critical regime, whereas ordered near-critical networks are the best at retaining the distinctiveness of each attractor's I(A) with noise. The results suggest that selection likely favors near-critical GRNs, as these both maximize mean and diversity of I(A) and are the most robust to noise. We find similar I(A) distributions in delayed stochastic models of GRNs. For a particular stochastic GRN, we show that both mean and variance of I(A) have local maxima as its connectivity and noise levels are varied, suggesting that the conclusions for the Boolean network models may be generalizable to more realistic models of GRNs.
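Mutual information between the states of two genes can be estimated directly from their time series. The following Python sketch uses a simple plug-in estimator, I(X;Y) = sum over (x, y) of p(x, y) log2[p(x, y) / (p(x) p(y))], on discrete sequences. It is only one common way to compute the quantity; the abstract does not specify the estimator or how states are paired in time, and the gene state sequences below are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long sequences of discrete states."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

# Hypothetical Boolean state sequences of three genes along an attractor.
gene_a = [0, 1, 0, 1, 0, 1, 0, 1]
gene_b = [0, 1, 0, 1, 0, 1, 0, 1]   # identical to gene_a
gene_c = [0, 0, 1, 1, 0, 0, 1, 1]   # statistically independent of gene_a

mi_ab = mutual_information(gene_a, gene_b)
mi_ac = mutual_information(gene_a, gene_c)
```

For Boolean genes the value ranges from 0 bits (independent states) to 1 bit (one gene fully determines the other); to measure propagation over time, one sequence could be shifted by one step relative to the other before calling the function.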
Identification of genetic signatures is the main objective for many computational oncology studies. The signature usually consists of numerous genes that are differentially expressed between two clinically distinct groups of samples, such as tumor subtypes. Prospectively, many signatures have been found to generalize poorly to other datasets and, thus, have rarely been accepted into clinical use. Recognizing the limited success of traditionally generated signatures, we developed a systems biology-based framework for robust identification of key transcription factors and their genomic regulatory neighborhoods. Application of the framework to study the differences between gastrointestinal stromal tumor (GIST) and leiomyosarcoma (LMS) resulted in the identification of nine transcription factors (SRF, NKX2-5, CCDC6, LEF1, VDR, ZNF250, TRIM63, MAF, and MYC). Functional annotations of the obtained neighborhoods identified the biological processes which the key transcription factors regulate differently between the tumor types. Analyzing the differences in the expression patterns using our approach resulted in a more robust genetic signature and more biological insight into the diseases compared to a traditional genetic signature.
We propose a Markov chain approximation of the delayed stochastic simulation algorithm to infer properties of the mechanisms in prokaryote transcription from the dynamics of RNA levels. We model transcription using the delayed stochastic modelling strategy and realistic parameter values for rate of transcription initiation and RNA degradation. From the model, we generate time series of RNA levels at the single molecule level, from which we use the method to infer the duration of the promoter open complex formation. This is found to be possible even when adding external Gaussian noise to the RNA levels.
A gene network's capacity to process information, so as to bind past events to future actions, depends on its structure and logic. From previous and new microarray measurements in Saccharomyces cerevisiae following gene deletions and overexpressions, we identify a core gene regulatory network (GRN) of functional interactions between 328 genes and the transfer functions of each gene. Inferred connections are verified by gene enrichment.
Several algorithms have been proposed for detecting fluorescently labeled subcellular objects in microscope images. Many of these algorithms have been designed for specific tasks and validated with limited image data. But despite the potential of using extensive comparisons between algorithms to provide useful information to guide method selection and thus more accurate results, relatively few studies have been performed.
Molecular interaction networks establish all cell biological processes. The networks are under intensive research that is facilitated by new high-throughput measurement techniques for the detection, quantification, and characterization of molecules and their physical interactions. For the common model organism yeast Saccharomyces cerevisiae, public databases store a significant part of the accumulated information and, on the way to better understanding of the cellular processes, there is a need to integrate this information into a consistent reconstruction of the molecular interaction network. This work presents and validates RefRec, the most comprehensive molecular interaction network reconstruction currently available for yeast. The reconstruction integrates protein synthesis pathways, a metabolic network, and a protein-protein interaction network from major biological databases. The core of the reconstruction is based on a reference object approach in which genes, transcripts, and proteins are identified using their primary sequences. This enables their unambiguous identification and non-redundant integration. The obtained total number of different molecular species and their connecting interactions is approximately 67,000. In order to demonstrate the capacity of RefRec for functional predictions, it was used for simulating the gene knockout damage propagation in the molecular interaction network in approximately 590,000 experimentally validated mutant strains. Based on the simulation results, a statistical classifier was subsequently able to correctly predict the viability of most of the strains. The results also showed that the usage of different types of molecular species in the reconstruction is important for accurate phenotype prediction. In general, the findings demonstrate the benefits of global reconstructions of molecular interaction networks. 
With all the molecular species and their physical interactions explicitly modeled, our reconstruction is able to serve as a valuable resource in additional analyses involving objects from multiple molecular -omes. For that purpose, RefRec is freely available in the Systems Biology Markup Language format.
Increased chromosomal instability that alters the gene copy numbers throughout the genome is known to have a role in the molecular pathogenesis of tumors. The impact of the gene dosage effect on the expression levels of genes in GIST and LMS is unknown. In this paper, we used a combination of array comparative genomic hybridization (aCGH) and gene expression data to gain insights into the interplay of structural and functional changes of the genome in GIST and LMS. We identified specific target genes that change their expression due to the gene dosage effect. Statistical analysis identified four chromosomal regions, 1p, 14q, 15q, and 22q, where both copy number and mRNA expression were significantly different between the tumor types. Multi-dimensional scaling (MDS) analysis showed that the gene expression profiles of these four regions accurately distinguish GIST and LMS. In addition, the gene dosage sensitive genes in these regions are differently involved in several tumor growth promoting pathways, implying that there are different mechanisms underlying GIST and LMS carcinogenesis. Integration of aCGH and gene expression data has not only provided insights into how DNA copy number variations affect the gene expression patterns in these cancers, but has also proved to be a promising method to choose biologically relevant biomarkers.
In preclinical studies, human adipose stem cells (ASCs) have been shown to have therapeutic applicability, but standard expansion methods for clinical applications have yet to be established. ASCs are typically expanded in medium containing fetal bovine serum (FBS). However, sera and other animal-derived culture reagents raise safety issues in clinical therapy, including possible infections and severe immune reactions. By expanding ASCs in medium containing human serum (HS), the problem can be eliminated. To define how allogeneic HS (alloHS) performs in ASC expansion compared to FBS, a comparative in vitro study in both serum supplements was performed. The choice of serum had a significant effect on ASCs. First, to reach cell proliferation levels comparable with 10% FBS, at least 15% alloHS was required. Second, while genes of the cell cycle pathway were overexpressed in alloHS, genes of the bone morphogenetic protein receptor-mediated signaling on the transforming growth factor beta signaling pathway regulating, for example, osteoblast differentiation, were overexpressed in FBS. The result was further supported by differentiation analysis, where early osteogenic differentiation was significantly enhanced in FBS. The data presented here underscore the importance of thorough investigation of ASCs for utilization in cell therapies. This study is a step forward in the understanding of these potential cells.
Prolonged culture of human embryonic stem cells (hESCs) can lead to adaptation and the acquisition of chromosomal abnormalities, underscoring the need for rigorous genetic analysis of these cells. Here we report the highest-resolution study of hESCs to date using an Affymetrix SNP 6.0 array containing 906,600 probes for single nucleotide polymorphisms (SNPs) and 946,000 probes for copy number variations (CNVs). Analysis of 17 different hESC lines maintained in different laboratories identified 843 CNVs of 50 kb-3 Mb in size. We identified, on average, 24% of the loss of heterozygosity (LOH) sites and 66% of the CNVs changed in culture between early and late passages of the same lines. Thirty percent of the genes detected within CNV sites had altered expression compared to samples with normal copy number states, of which >44% were functionally linked to cancer. Furthermore, LOH of the q arm of chromosome 16, which has not been observed previously in hESCs, was detected.
We study how long pause-prone sites, commonly sequence-dependent, affect transcription and RNA temporal levels in a delayed stochastic model of transcription at the single nucleotide level. We vary pause propensity, duration and the probability of premature termination of elongation at the pause site. We also study the effects of multiple pause sites. We show that pause sites can be used to fine-tune noise strength and burst size distribution of RNA levels. Varying pause rate and duration alone affects bursting but noise is not significantly affected. Noise strength can be changed by varying both parameters and, even more pronouncedly, by varying the probability of premature termination. Adding multiple pause sites amplifies the increase in noise and bursting. This regulatory mechanism of noise and bursting, being evolvable, may partially explain how different genes exhibit a wide spectrum of different behaviors. The results might assist the engineering of genes with a desired degree of noise.
Stochasticity in gene expression affects many cellular processes and is a source of phenotypic diversity between genetically identical individuals. Events in elongation, particularly RNA polymerase pausing, are a source of this noise. Since the rate and duration of pausing are sequence-dependent, this regulatory mechanism of transcriptional dynamics is evolvable. The dependency of pause propensity on regulatory molecules makes pausing a response mechanism to external stress. Using a delayed stochastic model of bacterial transcription at the single nucleotide level that includes the promoter open complex formation, pausing, arrest, misincorporation and editing, pyrophosphorolysis, and premature termination, we investigate how RNA polymerase pausing affects a gene's transcriptional dynamics and gene networks. We show that the duration and rate of occurrence of pauses affect the bursting in RNA production, transcriptional and translational noise, and the transient to reach mean RNA and protein levels. In a genetic repressilator, increasing the pausing rate and the duration of pausing events increases the period length but does not affect the robustness of the periodicity. We conclude that RNA polymerase pausing might be an important evolvable feature of genetic networks.
Little is known about the biological mechanisms that shape the distribution of intervals between the completion of RNA molecules (T(p)RNA), and thus transcriptional noise. We characterize numerically and analytically how the promoter open complex delay (tau(P)) and the transcription initiation rate (k(t)) shape T(p)RNA. From this, we assess the noise and mean of transcript levels and show that these can be tuned both independently and simultaneously by tau(P) and k(t). Finally, we characterize how tau(P) affects bursting in RNA production and show that the tau(P) measured for a lac promoter best fits independent measurements of the burst distribution of the same promoter. Since tau(P) affects noise in gene expression, and given that it is sequence dependent, it is likely to be evolvable.
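As a rough illustration of how tau(P) and k(t) could tune mean and noise separately, the worked equation below makes two simplifying assumptions that are not stated in the abstract: the open complex delay is a fixed time \tau_P, and the wait for the next initiation attempt is exponentially distributed with rate k_t. Under these assumptions only, the interval between consecutive initiations is T = \tau_P + X with X exponential.

```latex
\begin{align}
  \mathbb{E}[T] &= \tau_P + \frac{1}{k_t}, &
  \operatorname{Var}[T] &= \frac{1}{k_t^{2}}, &
  \mathrm{CV}^2 &= \frac{\operatorname{Var}[T]}{\mathbb{E}[T]^{2}}
                 = \frac{1}{\left(1 + k_t \tau_P\right)^{2}} .
\end{align}
```

In this simplified picture the mean depends on the sum \tau_P + 1/k_t, while the normalized variance depends only on the product k_t \tau_P. Scaling \tau_P and 1/k_t by the same factor changes the mean at constant CV^2, and trading one against the other at a fixed sum changes CV^2 at constant mean.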
An important milestone in revealing cell functions is to build a comprehensive understanding of transcriptional regulation processes. These processes are largely regulated by transcription factors (TFs) binding to DNA sites. Several TF binding site (TFBS) prediction methods have been developed, but they usually model the binding of a single TF at a time, although a few methods for predicting the binding of multiple TFs also exist. In this article, we propose a probabilistic model that predicts binding of several TFs simultaneously. Our method explicitly models the competitive binding between TFs and uses the prior knowledge of existing protein-protein interactions (PPIs), which mimics the situation in the nucleus. Modeling DNA binding for multiple TFs improves the accuracy of binding site prediction remarkably when compared with other programs and with cases where the individual binding prediction results of separate TFs have been combined. Traditional TFBS prediction methods usually predict an overwhelming number of false positives. This lack of specificity is largely overcome by our competitive binding prediction method. In addition, previously unpredictable binding sites can be detected with the help of PPIs. Source codes are available at http://www.cs.tut.fi/~harrila/.
Fluorescence microscopy is the standard tool for detection and analysis of cellular phenomena. This technique, however, has a number of drawbacks such as the limited number of available fluorescent channels in microscopes, overlapping excitation and emission spectra of the stains, and phototoxicity.
Gene regulatory networks (GRNs) are stochastic, thus, do not have attractors, but can remain in confined regions of the state space, i.e. the noisy attractors, which define the cell type and phenotype.
Cluster analysis has become a standard computational method for gene function discovery as well as for more general explanatory data analysis. A number of different approaches have been proposed for that purpose, out of which different mixture models provide a principled probabilistic framework. Cluster analysis is increasingly often supplemented with multiple data sources nowadays, and these heterogeneous information sources should be made as efficient use of as possible.
We present a delayed stochastic model of transcription at the single nucleotide level. The model accounts for the promoter open complex formation and includes alternative pathways to elongation, namely pausing, arrest, misincorporation and editing, pyrophosphorolysis, and premature termination. We confront the dynamics of this detailed model with a single-step multi-delayed stochastic model and with measurements of expression of a repressed gene at the single molecule level. At low expression rates both models match the experiments but, at higher rates, the two models differ significantly, with consequences for cell-to-cell phenotypic variability. The alternative pathway reactions cause this difference in dynamical behavior, for example by making polymerases collide more often on the template. Next, we confront the model with measurements of the transcriptional dynamics at the single RNA level of an induced gene and show that RNA production, besides its bursting dynamics, also exhibits pulses (2 or more RNAs produced in intervals smaller than the smallest interval between initiations). The distributions of occurrences and amplitudes of pulses match the experimental measurements. This pulsing and the noise at the elongation stage are shown to play a role in the dynamics of a genetic switch.
We investigate how the regulation of protein multi-functionalities affects the dynamics of a stochastic model of a toggle switch and the differentiation pattern of a cell population regulated by the switch. We study the effects of loss of functionality in DNA-binding and repression and the involvement in differentiation pathway choice. First, we show how the patterns of cell differentiation differ when each of these functionalities is fully non-functional. Next, tuning the fraction of proteins that have lost the ability to bind DNA is shown to allow fine tuning of the dynamics of the switch and of the cell differentiation pattern. Finally, biasing the probability of functionality of the two proteins biases the dynamics of the switch and cell differentiation patterns, especially when transcription factors retain the ability to bind DNA but have lost the ability to repress gene expression. Our results suggest that, besides transcriptional and translational levels of regulation, activation of functionalities in multi-functional proteins is an important regulator of gene networks.
We study the plasticity of a delayed stochastic model of a genetic toggle switch as a multipotent differentiation pathway switch, at the single cell and cell population levels, by observing distributions of differentiation pathway choices of genetically homogeneous cell populations. Assuming a model of stochastic pathway determination of cell differentiation that is regulated by the proteins of the switch, we vary the proteins' expression levels and degradation rates, which cells are known to be able to regulate, to vary the mean level, noise, and bias of the proteins' expression levels. It is shown that small changes in each of these dynamical features significantly and distinctively affect the dynamics of the switch at the single cell level and, thus, the cell differentiation patterns. The regulation of these features allows cells to regulate their pluripotency and the cell population's distribution of lineage choice, suggesting that the stochastic switch has high plasticity regarding differentiation pathway choice regulation, thus providing adaptability to environmental stresses and changes.
The kinetics of transcription initiation in Escherichia coli depend on the duration of two rate-limiting steps, the closed and the open complex formation. In a lac promoter variant, P(lac/ara-1), the kinetics of these steps is controlled by IPTG and arabinose. From in vivo single-RNA measurements, we find that induction affects the mean and normalized variance of the intervals between consecutive RNA productions. Transcript production is sub-Poissonian in all conditions tested. The kinetics of each step is independently controlled by a different inducer. We conclude that the regulatory mechanism of P(lac/ara-1) allows the stochasticity of gene expression to be environment-dependent.
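As a rough illustration of what sub-Poissonian means for the intervals between RNA productions, the sketch below assumes (as a simplification, not as the authors' analysis) that each interval is the sum of two independent, exponentially distributed steps standing in for the closed and open complex formations. The function names and the mean durations are hypothetical. A Poisson process has a squared coefficient of variation (normalized variance) of 1 for its intervals; two sequential steps give a value below 1.

```python
import random
import statistics


def cv2_two_step(mean_closed, mean_open):
    """Analytic normalized variance of the sum of two independent exponential steps."""
    return (mean_closed ** 2 + mean_open ** 2) / (mean_closed + mean_open) ** 2


def simulate_intervals(mean_closed, mean_open, n, seed=0):
    """Draw n intervals, each the sum of two exponential step durations."""
    rng = random.Random(seed)
    return [
        rng.expovariate(1.0 / mean_closed) + rng.expovariate(1.0 / mean_open)
        for _ in range(n)
    ]


def cv2(samples):
    """Empirical normalized variance: variance divided by the squared mean."""
    mean = statistics.fmean(samples)
    return statistics.pvariance(samples) / mean ** 2


if __name__ == "__main__":
    # Hypothetical step durations in seconds, chosen only for illustration.
    intervals = simulate_intervals(300.0, 300.0, n=20000)
    print("analytic CV^2:", cv2_two_step(300.0, 300.0))
    print("simulated CV^2:", round(cv2(intervals), 3))
```

In this toy model, making one step much longer than the other pushes the normalized variance back toward 1, which is one way an inducer acting on a single step could change both the mean and the noise of the intervals.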
In prokaryotes, the rate at which codons are translated varies from one codon to the next. Using a stochastic model of transcription and translation at the nucleotide and codon levels, we investigate the effects of the codon sequence on the dynamics of protein numbers. For sequences generated according to the codon frequencies in Escherichia coli, we find that mean protein numbers at near equilibrium differ with the codon sequence, due to the mean codon translation efficiencies, in particular of the codons at the ribosome binding site region. We find close agreement between these predictions and measurements of protein expression levels as a function of the codon sequence. Next, we investigate the effects of short codon sequences at the start/end of the RNA sequence with linearly increasing/decreasing translation efficiencies, known as slow ramps. The ramps affect the mean, but not the fluctuations, in protein numbers by affecting the rate of translation initiation. Finally, we show that slow ramps affect the dynamics of small genetic circuits, namely, switches and clocks. In switches, ramps affect the frequency of switching and bias the robustness of the noisy attractors. In repressilators, ramps alter the robustness of periodicity. We conclude that codon sequences affect the dynamics of gene expression and genetic circuits and, thus, are likely to be under selection regarding both mean codon frequency and spatial arrangement along the sequence.
In Escherichia coli, tetracycline prevents translation. When subjected to tetracycline, E. coli express TetA to pump it out by a mechanism that is sensitive yet fairly independent of cellular metabolism. We constructed a target gene, PtetA-mRFP1-96BS, with a 96 MS2-GFP binding site array in a single-copy BAC vector, whose expression is controlled by the tetA promoter. We measured the in vivo kinetics of production of individual RNA molecules of the target gene as a function of inducer concentration and temperature. From the distributions of intervals between transcription events, we find that RNA production by PtetA is a sub-Poissonian process. Next, we infer the number and duration of the prominent sequential steps in transcription initiation by maximum likelihood estimation. Under full induction and at optimal temperature, we observe three major steps. We find that the kinetics of RNA production under the control of PtetA, including the number and duration of the steps, varies with induction strength and temperature. The results are supported by a set of logical pairwise Kolmogorov-Smirnov tests. We conclude that the expression of TetA is controlled by a sequential mechanism that is robust yet sensitive to external signals.
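The idea of inferring the number of sequential steps by maximum likelihood can be sketched as follows. This is a deliberately simplified stand-in, not the authors' procedure: it assumes all steps are exponential with the same mean duration, so that intervals follow an Erlang distribution, whereas the study infers step durations that may differ. The function names, the candidate range, and the synthetic data are hypothetical.

```python
import math
import random


def erlang_loglik(intervals, n_steps):
    """Log-likelihood of an Erlang model with n_steps identical exponential steps.

    For a fixed number of steps, the maximum likelihood rate is n_steps / mean.
    """
    mean = sum(intervals) / len(intervals)
    rate = n_steps / mean
    loglik = 0.0
    for t in intervals:
        loglik += (
            n_steps * math.log(rate)
            + (n_steps - 1) * math.log(t)
            - rate * t
            - math.lgamma(n_steps)
        )
    return loglik


def best_step_count(intervals, max_steps=6):
    """Return the step count in 1..max_steps with the highest log-likelihood."""
    return max(range(1, max_steps + 1), key=lambda n: erlang_loglik(intervals, n))


def synthetic_intervals(n_steps, mean_step, n, seed=1):
    """Generate n synthetic intervals, each the sum of n_steps exponential durations."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0 / mean_step) for _ in range(n_steps))
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Synthetic data with three steps of 200 s each (illustrative values only).
    data = synthetic_intervals(n_steps=3, mean_step=200.0, n=2000)
    print("inferred number of steps:", best_step_count(data))
```

Every candidate model here has a single fitted parameter, so the log-likelihoods can be compared directly. Models with unequal step durations would need a numerical optimizer and a penalty for the extra parameters.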
Many pairs of genes in Escherichia coli are driven by closely spaced promoters. We study the dynamics of expression of such pairs of genes, using a model at the molecule and nucleotide level with delayed stochastic dynamics, as a function of the binding affinity of the RNA polymerase to the promoter region, of the geometry of the promoter, of the distance between transcription start sites (TSSs) and of the repression mechanism. We find that the rate limiting steps of transcription at the TSS, the closed and open complex formations, strongly affect the kinetics of RNA production for all promoter configurations. Beyond a certain rate of transcription initiation events, we find that the interference between polymerases correlates the dynamics of production of the two RNA molecules from the two TSSs and affects the distribution of intervals between consecutive productions of RNA molecules. The degree of correlation depends on the geometry, on the distance between TSSs, and on repressors. Small changes in the distance between TSSs can cause abrupt changes in behavior patterns, suggesting that the sequence between adjacent promoters may be subject to strong selective pressure. The results provide a better understanding of the sequence level mechanisms of transcription regulation in bacteria and may aid in the genetic engineering of artificial circuits based on closely spaced promoters.
Escherichia coli cells employ an asymmetric strategy at division, segregating unwanted substances to older poles, which has been associated with aging in these organisms. The kinetics of this process is still poorly understood. Using the MS2 coat protein fused to green fluorescent protein (GFP) and a reporter construct with multiple MS2 binding sites, we tracked individual RNA-MS2-GFP complexes in E. coli cells from the time when they were produced. Analyses of the kinetics and brightness of the spots showed that these spots appear in the midcell region, are composed of a single RNA-MS2-GFP complex, and reach a pole before another target RNA is formed, typically remaining there thereafter. The choice of pole is probabilistic and heavily biased toward one pole, similar to what was observed by previous studies regarding protein aggregates. Additionally, this mechanism was found to act independently on each disposed molecule. Finally, while the RNA-MS2-GFP complexes were disposed of, the MS2-GFP tagging molecules alone were not. We conclude that this asymmetric mechanism to segregate damage at the expense of aging individuals acts probabilistically on individual molecules and is capable of the accurate classification of molecules for disposal.