Differential microRNA (miRNA) expression profiling by high-throughput methods has generated a vast amount of information about the complex role of these small regulatory molecules in a broad spectrum of human diseases. However, the results of such studies are often inconsistent, mostly due to the lack of cross-platform standardization, ongoing discovery of novel miRNAs, and small sample sizes. Therefore, a critical and systematic analysis of all available information is essential for successful identification of the most relevant miRNAs. A meta-analysis approach allows the results from several independent studies to be integrated in order to achieve greater statistical power and estimate the variability between the studies. Here we describe, as an example, the use of a robust rank aggregation (RRA) method for identification of a miRNA meta-signature in lung cancer. This method analyzes prioritized gene lists and finds commonly overlapping genes, which are ranked consistently better than expected by chance. An RRA approach not only helps to prioritize the putative targets for further experimental studies but also highlights the challenges related to the development of miRNA-based tests and emphasizes the need for rigorous evaluation of the results before proceeding to clinical trials.
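The rank aggregation idea described above can be illustrated with a minimal sketch. It assumes that each study contributes one ranked list, that an item's rank is normalized by a common universe size, and that an item absent from a list receives the worst normalized rank of 1.0. The function names, the toy lists and the universe size of 100 are illustrative assumptions, not taken from the original study, and the sketch is a simplification rather than the published implementation.

```python
from math import comb


def order_statistic_pvalue(x, k, n):
    """P(k-th smallest of n independent uniform(0,1) values <= x).

    Equivalent to the probability that at least k of n uniform draws
    fall at or below x (a binomial tail).
    """
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(normalized_ranks):
    """Smallest order-statistic p-value, Bonferroni-corrected by list count."""
    n = len(normalized_ranks)
    ordered = sorted(normalized_ranks)
    p_min = min(
        order_statistic_pvalue(x, k, n) for k, x in enumerate(ordered, start=1)
    )
    return min(1.0, p_min * n)


def aggregate(ranked_lists, universe_size):
    """Return (item, score) pairs, best (lowest) score first."""
    items = {item for lst in ranked_lists for item in lst}
    scores = {}
    for item in items:
        normalized = [
            (lst.index(item) + 1) / universe_size if item in lst else 1.0
            for lst in ranked_lists
        ]
        scores[item] = rho_score(normalized)
    return sorted(scores.items(), key=lambda pair: pair[1])


# Toy input: three hypothetical studies, each reporting its top miRNAs.
studies = [
    ["miR-21", "miR-210", "miR-31"],
    ["miR-21", "miR-182", "miR-210"],
    ["miR-205", "miR-21", "miR-183"],
]
for name, score in aggregate(studies, universe_size=100):
    print(f"{name}\t{score:.3g}")
```

Items that sit near the top of several lists receive small scores, while an item reported by only one study is penalized by its 1.0 ranks in the others.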
The advent of genome-wide RNA interference (RNAi)-based screens puts us in the position to identify genes for all functions human cells carry out. However, for many functions, assay complexity and cost make genome-scale knockdown experiments impossible. Methods to predict genes required for cell functions are therefore needed to focus RNAi screens from the whole genome on the most likely candidates. Although different bioinformatics tools for gene function prediction exist, they lack experimental validation and are therefore rarely used by experimentalists. To address this, we developed an effective computational gene selection strategy that represents public data about genes as graphs and then analyzes these graphs using kernels on graph nodes to predict functional relationships. To demonstrate its performance, we predicted human genes required for a poorly understood cellular function, mitotic chromosome condensation, and experimentally validated the top 100 candidates with a focused RNAi screen by automated microscopy. Quantitative analysis of the images demonstrated that the candidates were indeed strongly enriched in condensation genes, including the discovery of several new factors. By combining bioinformatics prediction with experimental validation, our study shows that kernels on graph nodes are powerful tools to integrate public biological data and predict genes involved in cellular functions of interest.
DNA epigenetic modifications, such as methylation, are important regulators of tissue differentiation, contributing to processes of both development and cancer. Profiling the tissue-specific DNA methylome patterns will provide novel insights into normal and pathogenic mechanisms, as well as help in future epigenetic therapies. In this study, 17 somatic tissues from four autopsied humans were subjected to functional genome analysis using the Illumina Infinium HumanMethylation450 BeadChip, covering 486 428 CpG sites.
Regardless of the advent of high-throughput sequencing, microarrays remain central in current biomedical research. Conventional microarray analysis pipelines apply data reduction before the estimation of differential expression, which is likely to render the estimates susceptible to noise from signal summarization and reduce statistical power. We present a probe-level framework, which capitalizes on the high number of concurrent measurements to provide more robust differential expression estimates. The framework naturally extends to various experimental designs and target categories (e.g. transcripts, genes, genomic regions) as well as small sample sizes. Benchmarking in relation to popular microarray and RNA-sequencing data-analysis pipelines indicated high and stable performance on the Microarray Quality Control dataset and in a cell-culture model of hypoxia. Experimental data exhibiting long-range epigenetic silencing of gene expression were used to demonstrate the efficacy of detecting differential expression of genomic regions, a level of analysis not embraced by conventional workflows. Finally, we designed and conducted an experiment to identify hypothermia-responsive genes in terms of monotonic time-response. As a novel insight, hypothermia-dependent up-regulation of multiple genes of two major antioxidant pathways was identified and verified by quantitative real-time PCR.
Biological data acquisition is raising new challenges, both in data analysis and handling. Not only is it proving hard to analyze the data at the rate it is generated today, but simply reading and transferring data files can be prohibitively slow due to their size. This primarily concerns logistics within and between data centers, but is also important for workstation users in the analysis phase. Common usage patterns, such as comparing and transferring files, are proving computationally expensive and are tying down shared resources.
Increased availability of various genotyping techniques has initiated a race for finding genetic markers that can be used in diagnostics and personalized medicine. Although many genetic risk factors are known, key causes of common diseases with complex heritage patterns are still unknown. Identification of such complex traits requires a targeted study over a large collection of data. Ideally, such studies bring together data from many biobanks. However, data aggregation on such a large scale raises many privacy issues.
Pluripotency in human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) is regulated by three transcription factors: OCT3/4, SOX2, and NANOG. To fully exploit the therapeutic potential of these cells it is essential to have a good mechanistic understanding of the maintenance of self-renewal and pluripotency. In this study, we demonstrate a powerful systems biology approach in which we first expand a literature-based network encompassing the core regulators of pluripotency by assessing the behavior of genes targeted by perturbation experiments. We focused our attention on highly regulated genes encoding cell surface and secreted proteins as these can be more easily manipulated by the use of inhibitors or recombinant proteins. Qualitative modeling based on combining Boolean networks and in silico perturbation experiments was employed to identify novel pluripotency-regulating genes. We validated Interleukin-11 (IL-11) and demonstrate that this cytokine is a novel pluripotency-associated factor capable of supporting self-renewal in the absence of exogenously added bFGF in culture. To date, the various protocols for hESC maintenance require supplementation with bFGF to activate the Activin/Nodal branch of the TGFβ signaling pathway. Additional evidence supporting our findings is that IL-11 belongs to the same protein family as LIF, which is known to be necessary for maintaining pluripotency in mouse but not in human ESCs. These cytokines operate through the same gp130 receptor which interacts with Janus kinases. Our finding might explain why mESCs are in a more naïve cell state compared to hESCs and how to convert primed hESCs back to the naïve state. Taken together, our integrative modeling approach has identified novel genes as putative candidates to be incorporated into the expansion of the current gene regulatory network responsible for inducing and maintaining pluripotency.
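To make the combination of Boolean networks and in silico perturbation concrete, the following toy sketch runs a synchronous Boolean network to a fixed point, with and without a simulated knockout. The wiring between the nodes and the `DIFF` readout node are invented purely for illustration and are not the network from the study; `simulate` and `clamp` are hypothetical names.

```python
def simulate(rules, state, clamp=None, max_steps=50):
    """Synchronously update a Boolean network until it reaches a fixed point.

    rules: dict mapping node -> function(state) -> bool
    state: dict mapping node -> bool (initial condition)
    clamp: dict of nodes held at a fixed value (e.g. a knockout -> False)
    Returns the fixed-point state, or None if none is reached in max_steps.
    """
    clamp = clamp or {}
    state = {**state, **clamp}
    for _ in range(max_steps):
        nxt = {node: rule(state) for node, rule in rules.items()}
        nxt.update(clamp)  # perturbed nodes ignore their update rule
        if nxt == state:
            return state
        state = nxt
    return None


# Hypothetical toy wiring, NOT the published pluripotency network.
rules = {
    "OCT4": lambda s: s["SOX2"],
    "SOX2": lambda s: s["OCT4"],
    "NANOG": lambda s: s["OCT4"] and s["SOX2"],
    "DIFF": lambda s: not s["NANOG"],  # assumed differentiation readout
}
start = {"OCT4": True, "SOX2": True, "NANOG": True, "DIFF": False}

unperturbed = simulate(rules, start)
knockout = simulate(rules, start, clamp={"OCT4": False})
print("unperturbed:", unperturbed)
print("OCT4 knockout:", knockout)
```

Comparing the two fixed points shows how a single clamped node can propagate through the network, which is the kind of qualitative prediction such models are used to generate before experimental validation.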
A prerequisite for successful embryo implantation is adequate preparation of receptive endometrium and the establishment and maintenance of a viable embryo. The success of implantation further relies upon a two-way dialogue between the embryo and uterus. However, molecular bases of these preimplantation and implantation processes in humans are not well known. We performed genome expression analyses of human embryos (n = 128) and human endometria (n = 8). We integrated these data with protein-protein interactions in order to identify molecular networks within the endometrium and the embryo, and potential embryo-endometrium interactions at the time of implantation. For that, we applied a novel network profiling algorithm, HyperModules, which combines topological module identification and functional enrichment analysis. We found a major wave of transcriptional down-regulation in preimplantation embryos. In receptive-stage endometrium, several genes and signaling pathways were identified, including JAK-STAT signaling and inflammatory pathways. The main curated embryo-endometrium interaction network highlighted the importance of cell adhesion molecules in the implantation process. We also identified cytokine-cytokine receptor interactions involved in implantation, where osteopontin (SPP1), leukemia inhibitory factor (LIF) and leptin (LEP) pathways were intertwining. Further, we identified a number of novel players in human embryo-endometrium interactions, such as apolipoprotein D (APOD), endothelin 1 (END1), fibroblast growth factor 7 (FGF7), gastrin (GAST), kringle containing transmembrane protein 1 (KREMEN1), neuropilin 1 (NRP1), serpin peptidase inhibitor clade A member 3 (SERPINA3), versican (VCAN), and others. Our findings provide a fundamental resource for better understanding of the genetic network that leads to successful embryo implantation. We demonstrate the first systems biology approach into the complex molecular network of the implantation process in humans.
Lung cancer is one of the deadliest types of cancer, as shown by the poor survival and high relapse rates after surgery. Recently discovered microRNAs (miRNAs), small noncoding RNA molecules, play a crucial role in modulating gene expression networks and are directly involved in the progression of a number of human cancers. In this study, we analyzed the expression profile of 858 miRNAs in 38 Estonian nonsmall cell lung cancer (NSCLC) samples (Stage I and II) and 27 adjacent nontumorous tissue samples using Illumina miRNA arrays. We found that 39 miRNAs were significantly up-regulated and 33 significantly down-regulated in tumors compared with normal lung tissue. We observed aberrant expression of several well-characterized tumorigenesis-related miRNAs, as well as a number of miRNAs whose function is currently unknown. We show that low expression of miR-374a in early-stage NSCLC is associated with poor patient survival. The combinatorial effect of the up- and down-regulated miRNAs is predicted to most significantly affect pathways associated with cell migration, differentiation and growth, and several signaling pathways that contribute to tumorigenesis. In conclusion, our results demonstrate that expression of miR-374a at early stages of NSCLC progression can serve as a prognostic marker for patient risk stratification and may be a promising therapeutic target for the treatment of lung cancer.
Dendritic cells (DCs) and macrophages (MFs) are important multifunctional immune cells. Like other cell types, they express hundreds of different microRNAs (miRNAs) that are recently discovered post-transcriptional regulators of gene expression. Here we present updated miRNA expression profiles of monocytes, DCs and MFs. Compared with monocytes, ~50 miRNAs were found to be differentially expressed in immature and mature DCs or MFs, with major expression changes occurring during the differentiation. Knockdown of DICER1, a protein needed for miRNA biosynthesis, led to lower DC-specific intercellular adhesion molecule-3-grabbing non-integrin (DC-SIGN) and enhanced CD14 protein levels, confirming the importance of miRNAs in DC differentiation in general. Inhibition of the two most highly up-regulated miRNAs, miR-511 and miR-99b, also resulted in reduced DC-SIGN level. Prediction of miRNA-511 targets revealed a number of genes with known immune functions, of which TLR4 and CD80 were validated using inhibition of miR-511 in DCs and luciferase assays in HEK293 cells. Interestingly, under the cell cycle arrest conditions, miR-511 seems to function as a positive regulator of TLR4. In conclusion, we have identified miR-511 as a novel potent modulator of human immune response. In addition, our data highlight that miRNA influence on gene expression is dependent on the cellular environment.
Functional interpretation of candidate gene lists is an essential task in modern biomedical research. Here, we present the 2011 update of g:Profiler (http://biit.cs.ut.ee/gprofiler/), a popular collection of web tools for functional analysis. g:GOSt and g:Cocoa combine comprehensive methods for interpreting gene lists, ordered lists and list collections in the context of biomedical ontologies, pathways, transcription factor and microRNA regulatory motifs and protein-protein interactions. Additional tools, namely the biomolecule ID mapping service (g:Convert), gene expression similarity searcher (g:Sorter) and gene homology searcher (g:Orth) provide numerous ways for further analysis and interpretation. In this update, we have implemented several features of interest to the community: (i) functional analysis of single nucleotide polymorphisms and other DNA polymorphisms is supported by chromosomal queries; (ii) network analysis identifies enriched protein-protein interaction modules in gene lists; (iii) functional analysis covers human disease genes; and (iv) improved statistics and filtering provide more concise results. g:Profiler is a regularly updated resource that is available for a wide range of species, including mammals, plants, fungi and insects.
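Functional enrichment of a gene list against an annotation term is commonly scored with a one-sided hypergeometric test. The abstract does not spell out the statistic, so the sketch below assumes that test; the function name and the example counts are illustrative only, and multiple-testing correction is deliberately left out.

```python
from math import comb


def enrichment_pvalue(universe, annotated, query, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe:  number of genes in the background set
    annotated: number of background genes carrying the term
    query:     number of genes in the input list
    overlap:   number of input genes carrying the term
    """
    upper = min(annotated, query)
    total = comb(universe, query)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Illustrative numbers: 20,000 background genes, 200 carry the term,
# and 15 of a 100-gene query list carry it.
print(enrichment_pvalue(20000, 200, 100, 15))
```

Exact integer arithmetic via `math.comb` keeps the calculation stable for small tails, at the cost of speed on very large inputs.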
Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/.
DNA replication origins are licensed in early G(1) phase of the cell cycle where the origin recognition complex (ORC) recruits the minichromosome maintenance (MCM) helicase to origins. These pre-replicative complexes (pre-RCs) remain inactive until replication is initiated in the S phase. However, transcriptional activity in the regions of origins can eliminate their functionality by displacing the components of pre-RC from DNA. We analyzed genome-wide data of mRNA and cryptic unstable transcripts in the context of locations of replication origins in the yeast genome and found that at least one-third of the origins are transcribed and therefore might be inactivated by transcription. When investigating the fate of transcriptionally inactivated origins, we found that replication origins were repetitively licensed in G(1) to reestablish their functionality after transcription. We propose that reloading of pre-RC components in G(1) might be utilized for the maintenance of a sufficient number of competent origins for efficient initiation of DNA replication in S phase.
Despite the well-defined histological types of non-small cell lung cancer (NSCLC), a given stage is often associated with wide-ranging survival rates and treatment outcomes. This disparity has led to an increased demand for the discovery and identification of new informative biomarkers.
A 64-year-old male patient was diagnosed with 3 consecutive non-small cell lung carcinomas (NSCLC). In the current study, we applied whole-genome gene expression analysis to control, primary and locally recurrent cancer, and supposed metastasis samples of a single patient. To our knowledge, there are no published papers describing the gene expression profiles of a single patient's squamous cell lung cancers. As the histology and differentiation grade of the primary cancer and the supposed metastasis differed minimally, but local recurrence was poorly differentiated, molecular profiling of the samples was carried out in order to confirm or reject the hypothesis of second primary cancer. Principal component analysis of the gene expression data revealed distinction of the local recurrence. Gene ontology analysis showed no molecular characteristics of metastasis in the supposed metastasis. Gene expression analysis is valuable and can be supportive in decision-making of diagnostically complicated cancer cases.
Monocyte-derived macrophages and dendritic cells (DCs) are important in inflammatory processes and are often used for immunotherapeutic approaches. Blood monocytes can be differentiated into macrophages and DCs, which is accompanied with transcriptional changes in many genes, including chemokines and cell surface markers.
The current epidemic of obesity has caused a surge of interest in the study of adipose tissue formation. While major progress has been made in defining the molecular networks that control adipocyte terminal differentiation, the early steps of adipocyte development and the embryonic origin of this lineage remain largely unknown.
It is essential to understand the network of transcription factors controlling self-renewal of human embryonic stem cells (ESCs) and human embryonal carcinoma cells (ECs) if we are to exploit these cells in regenerative medicine regimes. Correlating gene expression levels after RNAi-based ablation of OCT4 function with its downstream targets enables a better prediction of motif-specific driven expression modules pertinent for self-renewal and differentiation of embryonic stem cells and induced pluripotent stem cells. We initially identified putative direct downstream targets of OCT4 by employing ChIP-on-chip analysis. A comparison of three peak analysis programs revealed a refined list of OCT4 targets in the human EC cell line NCCIT; this list was then compared to previously published OCT4 ChIP-on-chip datasets derived from both ES and EC cells. We have verified an enriched POU-motif, discovered by a de novo approach, thus enabling us to define six distinct modules of OCT4 binding and regulation of its target genes. A selection of these targets has been validated, like NANOG, which harbours the evolutionarily conserved OCT4-SOX2 binding motif within its proximal promoter. Other validated targets, which do not harbour the classical HMG motif, are USP44 and GADD45G, a key regulator of the cell cycle. Over-expression of GADD45G in NCCIT cells resulted in an enrichment and up-regulation of genes associated with the cell cycle (CDKN1B, CDKN1C, CDK6 and MAPK4) and developmental processes (BMP4, HAND1, EOMES, ID2, GATA4, GATA5, ISL1 and MSX1). A comparison of positively regulated OCT4 targets common to EC and ES cells identified genes such as NANOG, PHC1, USP44, SOX2, PHF17 and OCT4, thus further confirming their universal role in maintaining self-renewal in both cell types.
Finally we have created a user-friendly database (http://biit.cs.ut.ee/escd/), integrating all OCT4 and stem cell related datasets in both human and mouse ES and EC cells. In the current era of systems biology driven research, we envisage that our integrated embryonic stem cell database will prove beneficial to the booming field of ES, iPS and cancer research.
Transcription factor (TF) perturbation experiments give valuable insights into gene regulation. Genome-scale evidence from microarray measurements may be used to identify regulatory interactions between TFs and targets. Recently, Hu and colleagues published a comprehensive study covering 269 TF knockout mutants for the yeast Saccharomyces cerevisiae. However, the information that can be extracted from this valuable dataset is limited by the method employed to process the microarray data. Here, we present a reanalysis of the original data using improved statistical techniques freely available from the BioConductor project. We identify over 100,000 differentially expressed genes, nine times the total reported by Hu et al. We validate the biological significance of these genes by assessing their functions, the occurrence of upstream TF-binding sites, and the prevalence of protein-protein interactions. The reanalysed dataset outperforms the original across all measures, indicating that we have uncovered a vastly expanded list of relevant targets. In summary, this work presents a high-quality reanalysis that maximizes the information contained in the Hu et al. compendium. The dataset is available from ArrayExpress (accession: E-MTAB-109) and it will be invaluable to any scientist interested in the yeast transcriptional regulatory system.
A brachyury(+) mesodermal cell population with purity over 79% was obtained from differentiating brachyury embryonic stem cells (ESC) generated with brachyury promoter driven enhanced green fluorescent protein and puromycin-N-acetyltransferase. A comprehensive transcriptomic analysis of brachyury(+) cells enriched with puromycin application from 6-day-old embryoid bodies (EBs), 6-day-old control EBs and undifferentiated ESCs led to identification of 1573 uniquely up-regulated and 1549 uniquely down-regulated transcripts in brachyury(+) cells. Furthermore, transcripts up-regulated in brachyury(+) cells showed overrepresentation of Gene Ontology annotations (cell differentiation, blood vessel morphogenesis, striated muscle development, placenta development and cell motility) and Kyoto Encyclopedia of Genes and Genomes pathway annotations (mitogen-activated protein kinase signaling and transforming growth factor beta signaling). Transcripts representing Larp2 and Ankrd34b are notably up-regulated in brachyury(+) cells. Knockdown of Larp2 resulted in significant down-regulation of BMP-2 expression, and knockdown of Ankrd34b resulted in alteration of NF-H, PPARgamma and PECAM1 expression. The elucidation of transcriptomic signatures of ESC-derived brachyury(+) cells will contribute toward defining the genetic and cellular identities of presumptive mesodermal cells. Furthermore, there is a possible involvement of Larp2 in the regulation of the late mesodermal marker BMP-2. Ankrd34b might be a positive regulator of neurogenesis and a negative regulator of adipogenesis.
We present a web resource, MEM (Multi-Experiment Matrix), for gene expression similarity searches across many datasets. MEM features large collections of microarray datasets and utilizes rank aggregation to merge information from different datasets into a single global ordering with simultaneous statistical significance estimation. Unique features of MEM include automatic detection, characterization and visualization of datasets that include the strongest coexpression patterns. MEM is freely available at http://biit.cs.ut.ee/mem/.
The Alternative Splicing and Transcript Diversity database (ASTD) gives access to a vast collection of alternative transcripts that integrate transcription initiation, polyadenylation and splicing variant data. Alternative transcripts are derived from the mapping of transcribed sequences to the complete human, mouse and rat genomes using an extension of the computational pipeline developed for the ASD (Alternative Splicing Database) and ATD (Alternative Transcript Diversity) databases, which are now superseded by ASTD. For the human genome, ASTD identifies splicing variants, transcription initiation variants and polyadenylation variants in 68%, 68% and 62% of the gene set, respectively, consistent with current estimates for transcription variation. Users can access ASTD through a variety of browsing and query tools, including expression state-based queries for the identification of tissue-specific isoforms. Participating laboratories have experimentally validated a subset of ASTD-predicted alternative splice forms and alternative polyadenylation forms that were not previously reported. The ASTD database can be accessed at http://www.ebi.ac.uk/astd.
Measuring gene expression levels with microarrays is one of the key technologies of modern genomics. Clustering of microarray data is an important application, as genes with similar expression profiles may be regulated by common pathways and involved in related functions. Gene Ontology (GO) analysis and visualization allows researchers to study the biological context of discovered clusters and characterize genes with previously unknown functions. We present VisHiC (Visualization of Hierarchical Clustering), a web server for clustering and compact visualization of gene expression data combined with automated function enrichment analysis. The main output of the analysis is a dendrogram and visual heatmap of the expression matrix that highlights biologically relevant clusters based on enriched GO terms, pathways and regulatory motifs. Clusters with most significant enrichments are contracted in the final visualization, while less relevant parts are hidden altogether. Such a dense representation of microarray data gives a quick global overview of thousands of transcripts in many conditions and provides a good starting point for further analysis. VisHiC is freely available at http://biit.cs.ut.ee/vishic.
Phosphoinositide 3-kinase (PI3K)-dependent signaling has been implicated in the regulation of embryonic stem (ES) cell fate. To gain further insight into the mechanisms regulated by PI3Ks in murine ES cells, we have performed expression profiling using Affymetrix GeneChips to characterize the transcriptional changes that arise as a result of inhibition of PI3K-dependent signaling. Using filtering of greater than 1.5-fold change in expression and an analysis of variance significance level of p < .05, we have defined a dataset comprising 646 probe sets that detect changes in transcript expression (469 down and 177 up) on inhibition of PI3Ks. Changes in expression of selected genes have been validated by quantitative reverse transcription polymerase chain reaction. Gene ontology analyses reveal significant over-representation of transcriptional regulators within our dataset. In addition, several known regulators of ES cell pluripotency, for example, Nanog, Esrrb, Tbx3, and Tcl-1, are among the downregulated genes. To evaluate the functional involvement of selected genes in regulation of ES cell self-renewal, we have used short interfering RNA-mediated knockdown. These studies identify genes not previously associated with control of ES cell fate that are involved in regulating ES cell pluripotency, including the protein tyrosine phosphatase Shp-1 and the Zscan4 family of zinc finger proteins. Further gain-of-function analyses demonstrate the importance of Zscan4c in regulation of ES cell pluripotency.
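The filtering step described above, a change of more than 1.5-fold combined with p < .05, can be sketched as follows. The record layout, the function name and the probe set identifiers are hypothetical, and the p-values are assumed to come from an upstream analysis of variance that is not shown.

```python
def filter_probe_sets(records, min_fold=1.5, alpha=0.05):
    """Split significant probe sets into up- and down-regulated lists.

    records: iterable of (probe_id, control_mean, treated_mean, p_value),
             with means on a linear (not log) scale and greater than zero.
    A probe set passes when p_value < alpha and its expression changes by
    more than min_fold in either direction.
    """
    up, down = [], []
    for probe_id, control, treated, p_value in records:
        if p_value >= alpha:
            continue
        ratio = treated / control
        if ratio > min_fold:
            up.append(probe_id)
        elif ratio < 1 / min_fold:
            down.append(probe_id)
    return up, down


# Hypothetical probe sets: (id, control mean, inhibitor-treated mean, p-value)
example = [
    ("ps_001", 100.0, 40.0, 0.001),   # down, significant
    ("ps_002", 100.0, 180.0, 0.020),  # up, significant
    ("ps_003", 100.0, 300.0, 0.200),  # large change but not significant
    ("ps_004", 100.0, 120.0, 0.001),  # significant but below 1.5-fold
]
up, down = filter_probe_sets(example)
print("up:", up, "down:", down)
```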
Cellular processes are often carried out by intricate systems of interacting genes and proteins. Some of these systems are rather well studied and described in pathway databases, while the roles and functions of the majority of genes are poorly understood. A large compendium of public microarray data is available that covers a variety of conditions, samples, and tissues and provides a rich source for genome-scale information. We focus our study on the analysis of 35 curated biological pathways in the context of gene co-expression over a large variety of biological conditions. By defining a global co-expression similarity rank for each gene and pathway, we perform exhaustive leave-one-out computations to describe existing pathway memberships using other members of the corresponding pathway as reference. We demonstrate that while successful in recovering biological base processes such as metabolism and translation, the global correlation measure fails to detect gene memberships in signaling pathways where co-expression is less evident. Our results also show that pathway membership detection is more effective when using only a subset of corresponding pathway members as reference, supporting the existence of more tightly co-expressed subsets of genes within pathways. Our study assesses the predictive power of global gene expression correlation measures in reconstructing biological systems of various functions and specificity. The developed computational network has immediate applications in detecting dubious pathway members and predicting novel member candidates.
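The leave-one-out procedure described above can be illustrated with a simplified sketch. Here the global co-expression similarity rank is replaced by a plain mean Pearson correlation to the remaining pathway members, which is a stand-in chosen for brevity rather than the measure used in the study. The gene names and expression profiles are invented toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def leave_one_out_ranks(expr, pathway):
    """Rank each held-out pathway member among all non-reference genes.

    expr:    dict gene -> expression profile (list of floats)
    pathway: list of genes annotated to the pathway
    Returns dict gene -> rank, where 1 means the held-out gene is the
    candidate most similar to the remaining pathway members.
    """
    ranks = {}
    for held_out in pathway:
        reference = [g for g in pathway if g != held_out]
        candidates = [g for g in expr if g not in reference]
        score = {
            g: sum(pearson(expr[g], expr[r]) for r in reference) / len(reference)
            for g in candidates
        }
        ordered = sorted(candidates, key=lambda g: score[g], reverse=True)
        ranks[held_out] = ordered.index(held_out) + 1
    return ranks


# Toy data: A, B and C are tightly co-expressed; X and Y are not.
expr = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [1.0, 3.0, 5.0, 7.2],
    "X": [4.0, 3.0, 2.0, 1.0],
    "Y": [1.0, 5.0, 2.0, 4.0],
}
print(leave_one_out_ranks(expr, ["A", "B", "C"]))
```

A held-out member that ranks near the top is well described by its pathway, whereas a poorly ranked member would be a candidate for the dubious memberships mentioned above.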
Embryonic stem (ES) cells have high self-renewal capacity and the potential to differentiate into a large variety of cell types. To investigate gene networks operating in pluripotent ES cells and their derivatives, the "Functional Genomics in Embryonic Stem Cells" consortium (FunGenES) has analyzed the transcriptome of mouse ES cells in eleven diverse settings representing sixty-seven experimental conditions. To better illustrate gene expression profiles in mouse ES cells, we have organized the results in an interactive database with a number of features and tools. Specifically, we have generated clusters of transcripts that behave the same way under the entire spectrum of the sixty-seven experimental conditions; we have assembled genes in groups according to their time of expression during successive days of ES cell differentiation; we have included expression profiles of specific gene classes such as transcription regulatory factors and Expressed Sequence Tags; transcripts have been arranged in "Expression Waves" and juxtaposed to genes with opposite or complementary expression patterns; we have designed search engines to display the expression profile of any transcript during ES cell differentiation; gene expression data have been organized in animated graphs of KEGG signaling and metabolic pathways; and finally, we have incorporated advanced functional annotations for individual genes or gene clusters of interest and links to microarray and genomic resources. The FunGenES database provides a comprehensive resource for studies into the biology of ES cells.
Mouse embryonic stem (ES) cells remain pluripotent in vitro when grown in the presence of the cytokine Leukaemia Inhibitory Factor (LIF). Identification of LIF targets and of genes regulating the transition between pluripotent and early differentiated cells is a critical step for understanding the control of ES cell pluripotency.
The prognostic and diagnostic value of microRNA (miRNA) expression aberrations in lung cancer has been studied intensely in recent years. However, due to the application of different technological platforms and small sample sizes, the miRNA expression profiling efforts have led to inconsistent results between the studies. We performed a comprehensive meta-analysis of 20 published miRNA expression studies in lung cancer, including a total of 598 tumor and 528 non-cancerous control samples. Using a recently published robust rank aggregation method, we identified a statistically significant miRNA meta-signature of seven upregulated (miR-21, miR-210, miR-182, miR-31, miR-200b, miR-205 and miR-183) and eight downregulated (miR-126-3p, miR-30a, miR-30d, miR-486-5p, miR-451a, miR-126-5p, miR-143 and miR-145) miRNAs. We conducted a gene set enrichment analysis to identify pathways that are most strongly affected by altered expression of these miRNAs. We found that meta-signature miRNAs cooperatively target functionally related and biologically relevant genes in signaling and developmental pathways. We have shown that such a meta-analysis approach is a suitable and effective solution for identification of a statistically significant miRNA meta-signature by combining several miRNA expression studies. This method allows the analysis of data produced by different technological platforms that cannot otherwise be directly compared, or of studies for which raw data are unavailable.
Developmental neurotoxicity (DNT) and many forms of reproductive toxicity (RT) often manifest themselves in functional deficits that are not necessarily based on cell death, but rather on minor changes relating to cell differentiation or communication. The fields of DNT/RT would greatly benefit from in vitro tests that allow the identification of toxicant-induced changes of the cellular proteostasis, or of its underlying transcriptome network. Therefore, the human embryonic stem cell (hESC)-derived novel alternative test systems (ESNATS) European commission research project established RT tests based on defined differentiation protocols of hESC and their progeny. Valproic acid (VPA) and methylmercury (MeHg) were used as positive control compounds to address the following fundamental questions: (1) Does transcriptome analysis allow discrimination of the two compounds? (2) How does analysis of enriched transcription factor binding sites (TFBS) and of individual probe sets (PS) distinguish between test systems? (3) Can batch effects be controlled? (4) How many DNA microarrays are needed? (5) Is the highest non-cytotoxic concentration optimal and relevant for the study of transcriptome changes? VPA triggered vast transcriptional changes, whereas MeHg altered fewer transcripts. To attenuate batch effects, analysis has been focused on the 500 PS with highest variability. The test systems differed significantly in their responses (<20 % overlap). Moreover, within one test system, little overlap between the PS changed by the two compounds has been observed. However, using TFBS enrichment, a relatively large common response to VPA and MeHg could be distinguished from compound-specific responses. In conclusion, the ESNATS assay battery allows classification of human DNT/RT toxicants on the basis of their transcriptome profiles.
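The restriction to the most variable probe sets mentioned above can be sketched as follows. This is a minimal illustration only: the function name `top_variable_probe_sets`, the dictionary layout, and the toy values are assumptions, not the ESNATS analysis pipeline, and the abstract does not state which variability measure was used (sample variance is assumed here).

```python
from statistics import variance


def top_variable_probe_sets(expression, k=500):
    """Return the k probe-set IDs with the highest sample variance.

    `expression` maps a probe-set ID to its expression values across
    samples (hypothetical layout; at least two values per probe set).
    """
    ranked = sorted(
        expression,
        key=lambda ps: variance(expression[ps]),
        reverse=True,
    )
    return ranked[:k]


# Toy data, invented for illustration only.
toy = {
    "ps1": [1.0, 1.0, 1.0],
    "ps2": [1.0, 5.0, 9.0],
    "ps3": [2.0, 3.0, 4.0],
}
selected = top_variable_probe_sets(toy, k=2)
```

With real data, `k` would be set to 500 as in the abstract, and the selected probe sets would then be passed to the downstream comparison.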
We developed m:Explorer for identifying process-specific transcription factors (TFs) from multiple genome-wide sources, including transcriptome, DNA-binding and chromatin data. m:Explorer robustly outperforms similar techniques in finding cell cycle TFs in Saccharomyces cerevisiae. We predicted and experimentally tested regulators of quiescence (G0), a model of ageing, over a six-week time-course. We validated nine of the top 12 predictions as novel G0 TFs, including Δmga2, Δcst6, Δbas1 with higher viability and G0-essential TFs Tup1, Swi3. Pathway analysis associates longevity to reduced growth, reprogrammed metabolism and cell wall remodeling. m:Explorer (http://biit.cs.ut.ee/mexplorer/) is instrumental in interrogating eukaryotic regulatory systems using heterogeneous data.
The Autoimmune Regulator (AIRE) is a regulator of transcription in the thymic medulla, where it controls the expression of a large set of peripheral-tissue specific genes. AIRE interacts with the transcriptional coactivator and acetyltransferase CBP and synergistically cooperates with it in transcriptional activation. Here, we aimed to study a possible role of AIRE acetylation in the modulation of its activity. We found that AIRE is acetylated in tissue culture cells and that this acetylation is enhanced by overexpression of CBP and the CBP paralog p300. The acetylated lysines were located within the nuclear localization signal and the SAND domain. AIRE with mutations that mimicked acetylated K243 and K253 in the SAND domain had reduced transactivation activity and accumulated into fewer and larger nuclear bodies, whereas mutations that mimicked the unacetylated lysines were functionally similar to wild-type AIRE. Analogously to CBP, p300 localized to AIRE-containing nuclear bodies; however, the overexpression of p300 did not enhance the transcriptional activation of AIRE-regulated genes. Further studies showed that overexpression of p300 stabilized the AIRE protein. Interestingly, gene expression profiling revealed that AIRE with mutations mimicking K243/K253 acetylation in SAND was able to activate gene expression, although the affected genes differed from those regulated by wild-type AIRE and the activation level was lower. Our results suggest that AIRE acetylation can influence the selection of AIRE-activated genes.
Investigating the molecular mechanisms controlling the in vivo developmental program postembryogenesis is challenging and time consuming. However, the developmental program can be partly recapitulated in vitro by the use of cultured embryonic stem cells (ESCs). Similar to the totipotent cells of the inner cell mass, gene expression and morphological changes in cultured ESCs occur hierarchically during their differentiation, with epiblast cells developing first, followed by germ layers and finally somatic cells. Combining high-throughput -omics technologies with murine ESCs offers an alternative approach for studying developmental processes toward organ-specific cell phenotypes. We have attempted to understand the differentiation networks controlling embryogenesis in vivo using a time-course approach, by identifying molecules defining fundamental biological processes in the pluripotent state as well as in the early and late differentiation stages of ESCs. Our microarray data on ESC differentiation clearly demonstrate that the most critical early differentiation processes occur at days 2 and 3 of differentiation. Besides monitoring well-annotated markers pertinent to both self-renewal and potency (the capacity to differentiate into different cell lineages), we have identified candidate molecules for relevant signaling pathways. These molecules can be further investigated in gain- and loss-of-function studies to elucidate their role in pluripotency and differentiation. As an example, siRNA knockdown of MageB16, a gene highly expressed in the pluripotent state, showed that repressing its function induces differentiation.
The continued progress in developing technological platforms, the availability of many published experimental datasets, and the range of statistical methods for analyzing those data have made it possible to approach the same research question with several methods simultaneously. To get the best out of all these alternatives, we need to integrate their results in an unbiased manner. Prioritized gene lists are a common way of presenting results in genomic data analysis applications. Thus, rank aggregation methods can become a useful and general solution for the integration task.
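To make the idea of aggregating prioritized gene lists concrete, here is a minimal sketch of an order-statistics score in the spirit of robust rank aggregation. It is an illustration under stated assumptions, not the published implementation: the function names, the `universe_size` parameter, the choice to give an item missing from a list the worst normalized rank (1.0), and the toy lists are all assumptions made for this example.

```python
from math import comb


def order_stat_pvalue(x, k, n):
    """Probability that the k-th smallest of n uniform(0, 1) draws is <= x."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def aggregate_ranks(ranked_lists, universe_size):
    """Score each item by how consistently it ranks near the top.

    Ranks are normalized to (0, 1]. For every item, the smallest
    order-statistic p-value is taken and multiplied by the number of
    lists as a Bonferroni-style correction. Lower scores are better.
    """
    n = len(ranked_lists)
    items = {item for lst in ranked_lists for item in lst}
    scores = {}
    for item in items:
        # Assumption: an item absent from a list gets the worst rank, 1.0.
        norm = sorted(
            (lst.index(item) + 1) / universe_size if item in lst else 1.0
            for lst in ranked_lists
        )
        rho = min(order_stat_pvalue(r, k, n) for k, r in enumerate(norm, start=1))
        scores[item] = min(1.0, rho * n)
    return sorted(scores.items(), key=lambda pair: pair[1])


# Toy prioritized lists with hypothetical item names.
lists = [["a", "b", "c"], ["a", "c", "b"], ["a", "d", "b"]]
result = aggregate_ranks(lists, universe_size=100)
```

In this toy input, item `a` tops every list and therefore receives the smallest score, while `d`, which appears in only one list, receives the largest. Only ranks are used, so lists produced on different platforms can be combined without access to raw data.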