The Journal of Visualized Experiments (JoVE) is a peer-reviewed, PubMed-indexed video journal. Our mission is to increase the productivity of scientific research.

Articles by Victor Ruotti in JoVE

 JoVE General

Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA


JoVE 3340 10/27/2011

1 Regenerative Biology, Morgridge Institute for Research; 2 Department of Cell & Regenerative Biology, University of Wisconsin; 3 Department of Molecular, Cellular, & Regenerative Biology, University of California

Here we describe a method for preparation of both single read and paired end Illumina mRNA-Seq sequencing libraries for gene expression analysis based on T7 linear RNA amplification. This protocol requires only 10 nanograms of starting total RNA and generates highly consistent libraries representing whole transcripts.

Other articles by Victor Ruotti on PubMed

Integrative Genomics: in Silico Coupling of Rat Physiology and Complex Traits with Mouse and Human Data

Integration of the large variety of genome maps from several organisms provides the mechanism by which physiological knowledge obtained in model systems such as the rat can be projected onto the human genome to further the research on human disease. The release of the rat genome sequence provides new information for studies using the rat model and is a key reference against which existing and new rat physiological results can be aligned. Previously, we described comparative maps of the rat, mouse, and human based on EST sequence comparisons combined with radiation hybrid maps. Here, we use new data and introduce the Integrated Genomics Environment, an extensive database of curated and integrated maps, markers, and physiological results. These results are integrated by using VCMapview, a Java-based map integration and visualization tool. This unique environment allows researchers to relate results from cytogenetic, genetic, and radiation hybrid studies to the genome sequence and compare regions of interest between human, mouse, and rat. Integrating rat physiology with mouse genetics and clinical results from human by using the respective genomes provides a novel route to capitalize on comparative genomics and the strengths of model organism biology.

ChromSorter PC: a Database of Chromosomal Regions Associated with Human Prostate Cancer

Our increasing use of genetic and genomic strategies to understand human prostate cancer means that we need access to simplified and integrated information present in the associated biomedical literature. In particular, microarray gene expression studies and associated genetic mapping studies in prostate cancer would benefit from a generalized understanding of the prior work associated with this disease. This would allow us to focus subsequent laboratory studies to genomic regions already related to prostate cancer by other scientific methods. We have developed a database of prostate cancer related chromosomal information from the existing biomedical literature. The input material was based on a broad literature search with subsequent hand annotation of information relevant to prostate cancer.

ProMoST (Protein Modification Screening Tool): a Web-based Tool for Mapping Protein Modifications on Two-dimensional Gels

ProMoST is a flexible web tool that calculates the effect of single or multiple posttranslational modifications (PTMs) on protein isoelectric point (pI) and molecular weight and displays the calculated patterns as two-dimensional (2D) gel images. PTMs of proteins control many biological regulatory and signaling mechanisms and 2D gel electrophoresis is able to resolve many PTM-induced isoforms, such as those due to phosphorylation, acetylation, deamination, alkylation, cysteine oxidation or tyrosine nitration. These modifications cause changes in the pI of the protein by adding, removing or changing titratable groups. Proteins differ widely in buffering capacity and pI and therefore the same PTMs may give rise to quite different patterns of pI shifts in different proteins. It is impossible by visual inspection of a pattern of spots on a gel to determine which modifications are most likely to be present. The patterns of PTM shifts for different proteins can be calculated and are often quite distinctive. The theoretical gel images produced by ProMoST can be compared to the experimental 2D gel results to implicate probable PTMs and focus efforts on more detailed study of modified proteins. ProMoST has been implemented as a CGI script in Perl, available on a web server at http://proteomics.mcw.edu/promost.
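
The idea that a PTM shifts pI by adding titratable groups can be illustrated with a minimal Python sketch. This is not ProMoST's implementation: the pKa table, the two phosphate pKa values, and the function names `net_charge` and `isoelectric_point` are all assumptions chosen for illustration, and phosphorylation is assumed to occur on serine or threonine (side chains that are not otherwise titratable).

```python
# Assumed, textbook-style pKa values; NOT the table used by ProMoST.
PKA_POS = {"Nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEG = {"Cterm": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
# Assumption: each phosphate group contributes two acidic ionizations.
PHOSPHATE_PKAS = (2.1, 6.0)


def net_charge(seq, ph, n_phospho=0):
    """Henderson-Hasselbalch net charge of a peptide at a given pH."""
    charge = 0.0
    pos_groups = ["Nterm"] + [aa for aa in seq if aa in PKA_POS]
    neg_groups = ["Cterm"] + [aa for aa in seq if aa in PKA_NEG]
    for group in pos_groups:
        # Fraction of the basic group that is protonated (+1).
        charge += 1.0 / (1.0 + 10 ** (ph - PKA_POS[group]))
    for group in neg_groups:
        # Fraction of the acidic group that is deprotonated (-1).
        charge -= 1.0 / (1.0 + 10 ** (PKA_NEG[group] - ph))
    for pka in PHOSPHATE_PKAS:
        charge -= n_phospho / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq, n_phospho=0, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH, so bisection is safe.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid, n_phospho) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    peptide = "MKSAEDKRHSTG"  # hypothetical sequence
    for n in range(3):
        print(n, "phosphate(s): pI =", round(isoelectric_point(peptide, n), 2))
```

Each added phosphate lowers the computed pI, and the size of the shift depends on which other titratable groups the sequence contains, which is why the same PTM can produce different spot patterns for different proteins.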

Automated Analysis of Conserved Syntenies for the Zebrafish Genome

DeNovoID: a Web-based Tool for Identifying Peptides from Sequence and Mass Tags Deduced from De Novo Peptide Sequencing by Mass Spectroscopy

One of the core activities of high-throughput proteomics is the identification of peptides from mass spectra. Some peptides can be identified using spectral matching programs like Sequest or Mascot, but many spectra do not produce high quality database matches. De novo peptide sequencing is an approach to determine partial peptide sequences for some of the unidentified spectra. A drawback of de novo peptide sequencing is that it produces a series of ordered and disordered sequence tags and mass tags rather than a complete, non-degenerate peptide amino acid sequence. This incomplete data is difficult to use in conventional search programs such as BLAST or FASTA. DeNovoID is a program that has been specifically designed to use degenerate amino acid sequence and mass data derived from MS experiments to search a peptide database. Since the algorithm employed depends on the amino acid composition of the peptide and not its sequence, DeNovoID does not have to consider all possible sequences, but rather a smaller number of compositions consistent with a spectrum. DeNovoID also uses a geometric indexing scheme that reduces the number of calculations required to determine the best peptide match in the database. DeNovoID is available at http://proteomics.mcw.edu/denovoid.
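
To make the composition-versus-sequence distinction concrete, here is a minimal Python sketch of matching by amino acid composition. It is only an illustration under stated assumptions: the function name `composition_matches` and the example peptides are hypothetical, and the mass tags and geometric indexing scheme described above are not modelled.

```python
from collections import Counter


def composition_matches(tags, database):
    """Return database peptides whose residues cover every residue in the tags.

    The order of residues inside each tag is ignored, which is what lets
    disordered or partially ordered sequence tags be used at all.
    """
    required = Counter("".join(tags))
    hits = []
    for peptide in database:
        # Counter subtraction keeps only positive counts, so an empty
        # result means the peptide contains everything that is required.
        if not (required - Counter(peptide)):
            hits.append(peptide)
    return hits


if __name__ == "__main__":
    db = ["AGKL", "KGAL", "GGKL", "LLAK"]  # hypothetical peptide database
    print(composition_matches(["GA", "K"], db))
```

Because only residue counts are compared, "AGKL" and "KGAL" are treated identically, so the search space is the set of compositions rather than the much larger set of orderings.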

Tools and Strategies for Physiological Genomics: the Rat Genome Database

The broad goal of physiological genomics research is to link genes to their functions using appropriate experimental and computational techniques. Modern genomics experiments enable the generation of vast quantities of data, and interpretation of this data requires the integration of information derived from many diverse sources. Computational biology and bioinformatics offer the ability to manage and channel this information torrent. The Rat Genome Database (RGD; http://rgd.mcw.edu) has developed computational tools and strategies specifically supporting the goal of linking genes to their functional roles in rat and, using comparative genomics, to human and mouse. We present an overview of the database with a focus on these unique computational tools and describe strategies for the use of these resources in the area of physiological genomics.

Induced Pluripotent Stem Cell Lines Derived from Human Somatic Cells

Somatic cell nuclear transfer allows trans-acting factors present in the mammalian oocyte to reprogram somatic cell nuclei to an undifferentiated state. We show that four factors (OCT4, SOX2, NANOG, and LIN28) are sufficient to reprogram human somatic cells to pluripotent stem cells that exhibit the essential characteristics of embryonic stem (ES) cells. These induced pluripotent human stem cells have normal karyotypes, express telomerase activity, express cell surface markers and genes that characterize human ES cells, and maintain the developmental potential to differentiate into advanced derivatives of all three primary germ layers. Such induced pluripotent human cell lines should be useful in the production of new disease models and in drug development, as well as for applications in transplantation medicine, once technical limitations (for example, mutation through viral integration) are eliminated.

Whole-genome Analysis of Histone H3 Lysine 4 and Lysine 27 Methylation in Human Embryonic Stem Cells

We mapped Polycomb-associated H3K27 trimethylation (H3K27me3) and Trithorax-associated H3K4 trimethylation (H3K4me3) across the whole genome in human embryonic stem (ES) cells. The vast majority of H3K27me3 colocalized on genes modified with H3K4me3. These comodified genes displayed low expression levels and were enriched in developmental function. Another significant set of genes lacked both modifications and was also expressed at low levels in ES cells but was enriched for gene function in physiological responses rather than development. Comodified genes could change expression levels rapidly during differentiation, but so could a substantial number of genes in other modification categories. SOX2, POU5F1, and NANOG, pluripotency-associated genes, shifted from modification by H3K4me3 alone to colocalization of both modifications as they were repressed during differentiation. Our results demonstrate that H3K27me3 modifications change during early differentiation, both relieving existing repressive domains and imparting new ones, and that colocalization with H3K4me3 is not restricted to pluripotent cells.

A Study of the Relationships Between Oligonucleotide Properties and Hybridization Signal Intensities from NimbleGen Microarray Datasets

Well-defined relationships between oligonucleotide properties and hybridization signal intensities (HSI) can aid chip design, data normalization and true biological knowledge discovery. We clarify these relationships using the data from two microarray experiments containing over three million probes from 48 high-density chips. We find that melting temperature (T(m)) has the most significant effect on HSI while length for the long oligonucleotides studied has very little effect. Analysis of positional effect using a linear model provides evidence that the protruding ends of probes contribute more than tethered ends to HSI, which is further validated by specifically designed match fragment sliding and extension experiments. The impact of sequence similarity (SeqS) on HSI is not significant in comparison with other oligonucleotide properties. Using regression and regression tree analysis, we prioritize these oligonucleotide properties based on their effects on HSI. The implications of our discoveries for the design of unbiased oligonucleotides are discussed. We propose that designing isothermal probes by varying their length is a viable strategy to reduce sequence bias, though imposing selection constraints on other oligonucleotide properties is also essential.
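
The isothermal design strategy, growing each probe until it reaches a common melting temperature, can be sketched in a few lines of Python. This is an illustration only: it substitutes the crude Wallace rule (2 °C per A/T, 4 °C per G/C), which is intended for short oligonucleotides, for whatever T(m) model the study used, and the names `wallace_tm` and `isothermal_probe` and the length limits are assumptions.

```python
def wallace_tm(seq):
    """Crude melting temperature estimate in degrees C (Wallace rule).

    Assumption: used here only as a simple stand-in; it is not accurate
    for long oligonucleotides.
    """
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def isothermal_probe(target, start, target_tm, min_len=10, max_len=80):
    """Extend a probe from `start` until its estimated Tm reaches target_tm.

    Returns the shortest qualifying probe, or None if the target runs out
    or max_len is reached first.
    """
    for length in range(min_len, max_len + 1):
        probe = target[start:start + length]
        if len(probe) < length:
            break  # ran off the end of the target sequence
        if wallace_tm(probe) >= target_tm:
            return probe
    return None


if __name__ == "__main__":
    gc_rich = "GC" * 40  # hypothetical target regions
    at_rich = "AT" * 40
    print(len(isothermal_probe(gc_rich, 0, 60)))
    print(len(isothermal_probe(at_rich, 0, 60)))
```

GC-rich regions reach the target temperature with fewer bases than AT-rich regions, so probe length varies while the estimated melting temperature stays roughly constant.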

Molecular Profiling Reveals Similarities and Differences Between Primitive Subsets of Hematopoietic Cells Generated in Vitro from Human Embryonic Stem Cells and in Vivo During Embryogenesis

Cellular and molecular changes that occur during the genesis of the hematopoietic system and hematopoietic stem cells in the human embryo are mostly inaccessible to study and remain poorly understood. To address this gap we have exploited the human embryonic stem cell (hESC) system to molecularly characterize the global transcriptomes of the two functionally discrete and phenotypically separable populations of multipotent hematopoietic cells that first appear when hESCs are induced to differentiate on OP9 cells.

ProbeMatch: Rapid Alignment of Oligonucleotides to Genome Allowing Both Gaps and Mismatches

SUMMARY: We have developed a tool, called ProbeMatch, for matching a large set of oligonucleotide sequences against a genome database using gapped alignments. Unlike most of the existing tools such as ELAND which only perform ungapped alignments allowing at most two mismatches, ProbeMatch generates both ungapped and gapped alignments allowing up to three errors including insertion, deletion and mismatch. To speed up sequence alignment, ProbeMatch uses gapped q-grams and q-grams of various patterns to identify target hits to a query sequence. This approach results in fewer initial sequences to examine with no loss in sensitivity. ProbeMatch has been used to align 169,095 Illumina GAII reads against the human genome, which could not be mapped by ELAND, and found alignments for 28,625 of those reads in less than 3 h. AVAILABILITY: Source code is freely available at (http://www.cs.wisc.edu/~jignesh/probematch/).
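
The filtering idea behind q-gram seeding can be sketched in Python. This is a simplified illustration, not ProbeMatch's algorithm: it uses plain contiguous q-grams rather than gapped q-grams, it handles substitutions only (insertions and deletions would require pooling votes across neighbouring diagonals), and the function names are hypothetical. The threshold comes from the standard q-gram lemma: a read of length n with at most k substitutions still shares at least n - q + 1 - k*q q-grams with its true location.

```python
from collections import defaultdict


def build_qgram_index(genome, q):
    """Map every contiguous q-gram in the genome to its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - q + 1):
        index[genome[i:i + q]].append(i)
    return index


def candidate_positions(read, index, q, max_errors):
    """Return genome start positions that pass the q-gram filter.

    Each shared q-gram votes for the alignment start it implies; only
    starts with enough votes need to be verified by a full alignment.
    """
    threshold = len(read) - q + 1 - q * max_errors
    if threshold < 1:
        raise ValueError("q is too large for this read length and error budget")
    votes = defaultdict(int)
    for offset in range(len(read) - q + 1):
        for pos in index.get(read[offset:offset + q], ()):
            votes[pos - offset] += 1
    return sorted(start for start, n in votes.items() if n >= threshold)


if __name__ == "__main__":
    genome = "ACGTACGTTAGCCGATAGGCTTAACCGGTT"  # hypothetical toy genome
    read = genome[8:16] + "C" + genome[17:24]  # one substitution
    idx = build_qgram_index(genome, 4)
    print(candidate_positions(read, idx, q=4, max_errors=1))
```

Because the threshold is a guaranteed lower bound, the filter discards positions without losing any true hit within the error budget, which is the sense in which such seeding reduces work with no loss in sensitivity.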

Human DNA Methylomes at Base Resolution Show Widespread Epigenomic Differences

DNA cytosine methylation is a central epigenetic modification that has essential roles in cellular processes including genome regulation, development and disease. Here we present the first genome-wide, single-base-resolution maps of methylated cytosines in a mammalian genome, from both human embryonic stem cells and fetal fibroblasts, along with comparative analysis of messenger RNA and small RNA components of the transcriptome, several histone modifications, and sites of DNA-protein interaction for several key regulatory factors. Widespread differences were identified in the composition and patterning of cytosine methylation between the two genomes. Nearly one-quarter of all methylation identified in embryonic stem cells was in a non-CG context, suggesting that embryonic stem cells may use different methylation mechanisms to affect gene regulation. Methylation in non-CG contexts showed enrichment in gene bodies and depletion in protein binding sites and enhancers. Non-CG methylation disappeared upon induced differentiation of the embryonic stem cells, and was restored in induced pluripotent stem cells. We identified hundreds of differentially methylated regions proximal to genes involved in pluripotency and differentiation, and widespread reduced methylation levels in fibroblasts associated with lower transcriptional activity. These reference epigenomes provide a foundation for future studies exploring this key epigenetic modification in human disease and development.

RNA-Seq Gene Expression Estimation with Read Mapping Uncertainty

MOTIVATION: RNA-Seq is a promising new technology for accurately measuring gene expression levels. Expression estimation with RNA-Seq requires the mapping of relatively short sequencing reads to a reference genome or transcript set. Because reads are generally shorter than transcripts from which they are derived, a single read may map to multiple genes and isoforms, complicating expression analyses. Previous computational methods either discard reads that map to multiple locations or allocate them to genes heuristically. RESULTS: We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNA-Seq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. Unlike previous methods, our method is capable of modeling non-uniform read distributions. Simulations with our method indicate that a read length of 20-25 bases is optimal for gene-level expression estimation from mouse and maize RNA-Seq data when sequencing throughput is fixed.
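
The core idea of handling multi-mapping reads with a statistical model, rather than discarding them or allocating them heuristically, can be illustrated with a toy expectation-maximization loop in Python. This sketch is an assumption-laden simplification and not the published method: it ignores transcript length, read position, orientation, and sequencing error, and the names `em_read_fractions` and `gene_level` are hypothetical.

```python
def em_read_fractions(read_alignments, n_transcripts, n_iter=200):
    """Estimate the fraction of reads originating from each transcript.

    read_alignments: one list per read, holding the indices of the
    transcripts that read is compatible with.
    """
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(n_iter):
        expected = [0.0] * n_transcripts
        # E-step: split each read across its compatible transcripts in
        # proportion to the current abundance estimates.
        for compatible in read_alignments:
            total = sum(theta[t] for t in compatible)
            for t in compatible:
                expected[t] += theta[t] / total
        # M-step: renormalize the expected counts into fractions.
        n_assigned = sum(expected)
        theta = [count / n_assigned for count in expected]
    return theta


def gene_level(theta, isoforms_by_gene):
    """Gene-level estimate as the sum of its isoform-level estimates."""
    return {gene: sum(theta[t] for t in isoforms)
            for gene, isoforms in isoforms_by_gene.items()}


if __name__ == "__main__":
    # Hypothetical data: 6 reads unique to transcript 0, 2 unique to
    # transcript 1, and 4 reads that map equally well to both.
    reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
    theta = em_read_fractions(reads, n_transcripts=2)
    print(theta)
    print(gene_level(theta, {"geneA": [0, 1]}))
```

In this toy case the ambiguous reads end up shared 3:1 in favour of transcript 0, so the estimates converge toward 0.75 and 0.25 instead of the 0.67 and 0.33 that an even split of ambiguous reads would give.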

Distinct Epigenomic Landscapes of Pluripotent and Lineage-committed Human Cells

Human embryonic stem cells (hESCs) share an identical genome with lineage-committed cells, yet possess the remarkable properties of self-renewal and pluripotency. The diverse cellular properties in different cells have been attributed to their distinct epigenomes, but how much epigenomes differ remains unclear. Here, we report that epigenomic landscapes in hESCs and lineage-committed cells are drastically different. By comparing the chromatin-modification profiles and DNA methylomes in hESCs and primary fibroblasts, we find that nearly one-third of the genome differs in chromatin structure. Most changes arise from dramatic redistributions of repressive H3K9me3 and H3K27me3 marks, which form blocks that significantly expand in fibroblasts. A large number of potential regulatory sequences also exhibit a high degree of dynamics in chromatin modifications and DNA methylation. Additionally, we observe novel, context-dependent relationships between DNA methylation and chromatin modifications. Our results provide new insights into epigenetic mechanisms underlying properties of pluripotency and cell fate commitment.

Highly Consistent, Fully Representative mRNA-Seq Libraries from Ten Nanograms of Total RNA

Preparation of an Illumina sequencing library for gene expression analysis (mRNA-Seq) requires microgram amounts of starting total RNA or PCR-based amplification. Here we describe a protocol based on T7 linear RNA amplification that does not introduce significant bias, requires only 10 ng total RNA, and generates a directional, fully representative, whole-transcript mRNA-Seq Illumina library that is highly consistent across over three orders of magnitude of input RNA.

Proteomic and Phosphoproteomic Comparison of Human ES and iPS Cells

Combining high-mass-accuracy mass spectrometry, isobaric tagging and software for multiplexed, large-scale protein quantification, we report deep proteomic coverage of four human embryonic stem cell and four induced pluripotent stem cell lines in biological triplicate. This 24-sample comparison resulted in a very large set of identified proteins and phosphorylation sites in pluripotent cells. The statistical analysis afforded by our approach revealed subtle but reproducible differences in protein expression and protein phosphorylation between embryonic stem cells and induced pluripotent cells. Merging these results with RNA-seq analysis data, we found functionally related differences across each tier of regulation. We also introduce the Stem Cell-Omics Repository (SCOR), a resource to collate and display quantitative information across multiple planes of measurement, including mRNA, protein and post-translational modifications.

Chemically Defined Conditions for Human iPSC Derivation and Culture

We re-examine the individual components for human embryonic stem cell (ESC) and induced pluripotent stem cell (iPSC) culture and formulate a cell culture system in which all protein reagents for liquid media, attachment surfaces and splitting are chemically defined. A major improvement is the lack of a serum albumin component, as variations in either animal- or human-sourced albumin batches have previously plagued human ESC and iPSC culture with inconsistencies. Using this new medium (E8) and vitronectin-coated surfaces, we demonstrate improved derivation efficiencies of vector-free human iPSCs with an episomal approach. This simplified E8 medium should facilitate both the research use and clinical applications of human ESCs and iPSCs and their derivatives, and should be applicable to other reprogramming methods.
