Estrogen plays vital roles in mammary gland development and breast cancer progression. It mediates its function by binding to and activating the estrogen receptors (ERs), ERα, and ERβ. ERα is frequently upregulated in breast cancer and drives the proliferation of breast cancer cells. The ERs function as transcription factors and regulate gene expression. Whereas ERα's regulation of protein-coding genes is well established, its regulation of noncoding microRNA (miRNA) is less explored. miRNAs play a major role in the post-transcriptional regulation of genes, inhibiting their translation or degrading their mRNA. miRNAs can function as oncogenes or tumor suppressors and are also promising biomarkers. Among the miRNA assays available, microarray and quantitative real-time polymerase chain reaction (qPCR) have been extensively used to detect and quantify miRNA levels. To identify miRNAs regulated by estrogen signaling in breast cancer, their expression in ERα-positive breast cancer cell lines were compared before and after estrogen-activation using both the µParaflo-microfluidic microarrays and Dual Labeled Probes-low density arrays. Results were validated using specific qPCR assays, applying both Cyanine dye-based and Dual Labeled Probes-based chemistry. Furthermore, a time-point assay was used to identify regulations over time. Advantages of the miRNA assay approach used in this study is that it enables a fast screening of mature miRNA regulations in numerous samples, even with limited sample amounts. The layout, including the specific conditions for cell culture and estrogen treatment, biological and technical replicates, and large-scale screening followed by in-depth confirmations using separate techniques, ensures a robust detection of miRNA regulations, and eliminates false positives and other artifacts. However, mutated or unknown miRNAs, or regulations at the primary and precursor transcript level, will not be detected. The method presented here represents a thorough investigation of estrogen-mediated miRNA regulation.
23 Related JoVE Articles!
Global Gene Expression Analysis Using a Zebrafish Oligonucleotide Microarray Platform
Institutions: Purdue University.
Gene microarray technology permits quantitative measurement and gene expression profiling of transcript levels on a genome-wide basis. Gene microarray technology is used in numerous biological disciplines in a variety of applications including global gene expression analysis in relation to developmental stage, to a disease state, and in toxic responses. Herein, we include a demonstration of global gene expression analysis using a comprehensive zebrafish-specific oligonucleotide microarray platform. The zebrafish expression microarray platform contains 385,000 probes, 60 base pairs in length, interrogating 37,157 targets with up to 12 probes per target. For this platform, all cDNA and genomic information available for the zebrafish was collected from various genomic databases including Ensembl (http://www.ensembl.org), VEGA (http://vega.sanger.ac.uk), UCSC (http://genome.ucsc.edu), and ZFIN (http://www.zfin.org). As a result this expression array provides complete coverage of the current zebrafish transcriptome. The zebrafish expression microarray was printed by Roche NimbleGen (Madison, WI). This technical demonstration includes the fluorescent labeling of a cDNA product, hybridization of the labeled cDNA product to the microarray platform, and array scanning for signal acquisition using the one color analysis strategy.
Developmental Biology, Issue 30, zebrafish, microarray, genomics, gene expression, RNA, oligonucleotide
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Demonstrating a Multi-drug Resistant Mycobacterium tuberculosis Amplification Microarray
Institutions: Akonni Biosystems, Inc..
Simplifying microarray workflow is a necessary first step for creating MDR-TB microarray-based diagnostics that can be routinely used in lower-resource environments. An amplification microarray combines asymmetric PCR amplification, target size selection, target labeling, and microarray hybridization within a single solution and into a single microfluidic chamber. A batch processing method is demonstrated with a 9-plex asymmetric master mix and low-density gel element microarray for genotyping multi-drug resistant Mycobacterium tuberculosis
(MDR-TB). The protocol described here can be completed in 6 hr and provide correct genotyping with at least 1,000 cell equivalents of genomic DNA. Incorporating on-chip wash steps is feasible, which will result in an entirely closed amplicon method and system. The extent of multiplexing with an amplification microarray is ultimately constrained by the number of primer pairs that can be combined into a single master mix and still achieve desired sensitivity and specificity performance metrics, rather than the number of probes that are immobilized on the array. Likewise, the total analysis time can be shortened or lengthened depending on the specific intended use, research question, and desired limits of detection. Nevertheless, the general approach significantly streamlines microarray workflow for the end user by reducing the number of manually intensive and time-consuming processing steps, and provides a simplified biochemical and microfluidic path for translating microarray-based diagnostics into routine clinical practice.
Immunology, Issue 86, MDR-TB, gel element microarray, closed amplicon, drug resistance, rifampin, isoniazid, streptomycin, ethambutol
Polysome Fractionation and Analysis of Mammalian Translatomes on a Genome-wide Scale
Institutions: McGill University, Karolinska Institutet, McGill University.
mRNA translation plays a central role in the regulation of gene expression and represents the most energy consuming process in mammalian cells. Accordingly, dysregulation of mRNA translation is considered to play a major role in a variety of pathological states including cancer. Ribosomes also host chaperones, which facilitate folding of nascent polypeptides, thereby modulating function and stability of newly synthesized polypeptides. In addition, emerging data indicate that ribosomes serve as a platform for a repertoire of signaling molecules, which are implicated in a variety of post-translational modifications of newly synthesized polypeptides as they emerge from the ribosome, and/or components of translational machinery. Herein, a well-established method of ribosome fractionation using sucrose density gradient centrifugation is described. In conjunction with the in-house developed “anota” algorithm this method allows direct determination of differential translation of individual mRNAs on a genome-wide scale. Moreover, this versatile protocol can be used for a variety of biochemical studies aiming to dissect the function of ribosome-associated protein complexes, including those that play a central role in folding and degradation of newly synthesized polypeptides.
Biochemistry, Issue 87, Cells, Eukaryota, Nutritional and Metabolic Diseases, Neoplasms, Metabolic Phenomena, Cell Physiological Phenomena, mRNA translation, ribosomes,
protein synthesis, genome-wide analysis, translatome, mTOR, eIF4E, 4E-BP1
RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord
Institutions: Universidade de São Paulo.
electroporation of the chick neural tube is a fast and inexpensive method for identification of gene function during neural development. Genome wide analysis of differentially expressed transcripts after such an experimental manipulation has the potential to uncover an almost complete picture of the downstream effects caused by the transfected construct. This work describes a simple method for comparing transcriptomes from samples of transfected embryonic spinal cords comprising all steps between electroporation and identification of differentially expressed transcripts. The first stage consists of guidelines for electroporation and instructions for dissection of transfected spinal cord halves from HH23 embryos in ribonuclease-free environment and extraction of high-quality RNA samples suitable for transcriptome sequencing. The next stage is that of bioinformatic analysis with general guidelines for filtering and comparison of RNA-Seq datasets in the Galaxy public server, which eliminates the need of a local computational structure for small to medium scale experiments. The representative results show that the dissection methods generate high quality RNA samples and that the transcriptomes obtained from two control samples are essentially the same, an important requirement for detection of differential expression genes in experimental samples. Furthermore, one example is provided where experimental overexpression of a DNA construct can be visually verified after comparison with control samples. The application of this method may be a powerful tool to facilitate new discoveries on the function of neural factors involved in spinal cord early development.
Developmental Biology, Issue 93, chicken embryo, in ovo electroporation, spinal cord, RNA-Seq, transcriptome profiling, Galaxy workflow
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus
, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus.
Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus
due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus Asian tiger mosquito, photoperiodic diapause, RNA-Seq de novo transcriptome assembly, mosquito husbandry
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro
. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro
and in vivo
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
Enhanced Reduced Representation Bisulfite Sequencing for Assessment of DNA Methylation at Base Pair Resolution
Institutions: Weill Cornell Medical College, Weill Cornell Medical College, Weill Cornell Medical College, University of Michigan.
DNA methylation pattern mapping is heavily studied in normal and diseased tissues. A variety of methods have been established to interrogate the cytosine methylation patterns in cells. Reduced representation of whole genome bisulfite sequencing was developed to detect quantitative base pair resolution cytosine methylation patterns at GC-rich genomic loci. This is accomplished by combining the use of a restriction enzyme followed by bisulfite conversion. Enhanced Reduced Representation Bisulfite Sequencing (ERRBS) increases the biologically relevant genomic loci covered and has been used to profile cytosine methylation in DNA from human, mouse and other organisms. ERRBS initiates with restriction enzyme digestion of DNA to generate low molecular weight fragments for use in library preparation. These fragments are subjected to standard library construction for next generation sequencing. Bisulfite conversion of unmethylated cytosines prior to the final amplification step allows for quantitative base resolution of cytosine methylation levels in covered genomic loci. The protocol can be completed within four days. Despite low complexity in the first three bases sequenced, ERRBS libraries yield high quality data when using a designated sequencing control lane. Mapping and bioinformatics analysis is then performed and yields data that can be easily integrated with a variety of genome-wide platforms. ERRBS can utilize small input material quantities making it feasible to process human clinical samples and applicable in a range of research applications. The video produced demonstrates critical steps of the ERRBS protocol.
Genetics, Issue 96, Epigenetics, bisulfite sequencing, DNA methylation, genomic DNA, 5-methylcytosine, high-throughput
Genome-wide Snapshot of Chromatin Regulators and States in Xenopus Embryos by ChIP-Seq
Institutions: MRC National Institute for Medical Research.
The recruitment of chromatin regulators and the assignment of chromatin states to specific genomic loci are pivotal to cell fate decisions and tissue and organ formation during development. Determining the locations and levels of such chromatin features in vivo
will provide valuable information about the spatio-temporal regulation of genomic elements, and will support aspirations to mimic embryonic tissue development in vitro
. The most commonly used method for genome-wide and high-resolution profiling is chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq). This protocol outlines how yolk-rich embryos such as those of the frog Xenopus
can be processed for ChIP-Seq experiments, and it offers simple command lines for post-sequencing analysis. Because of the high efficiency with which the protocol extracts nuclei from formaldehyde-fixed tissue, the method allows easy upscaling to obtain enough ChIP material for genome-wide profiling. Our protocol has been used successfully to map various DNA-binding proteins such as transcription factors, signaling mediators, components of the transcription machinery, chromatin modifiers and post-translational histone modifications, and for this to be done at various stages of embryogenesis. Lastly, this protocol should be widely applicable to other model and non-model organisms as more and more genome assemblies become available.
Developmental Biology, Issue 96, Chromatin immunoprecipitation, next-generation sequencing, ChIP-Seq, developmental biology, Xenopus embryos, cross-linking, transcription factor, post-sequencing analysis, DNA occupancy, metagene, binding motif, GO term
A Fast and Reliable Pipeline for Bacterial Transcriptome Analysis Case study: Serine-dependent Gene Regulation in Streptococcus pneumoniae
Institutions: University of Groningen, Government College University Faisalabad.
Gene expression and its regulation are very important to understand the behavior of cells under different conditions. Various techniques are used nowadays to study gene expression, but most are limited in terms of providing an overall picture of the expression of the whole transcriptome. DNA microarrays offer a fast and economic research technology, which gives a full overview of global gene expression and have a vast number of applications including identification of novel genes and transcription factor binding sites, characterization of transcriptional activity of the cells and also help in analyzing thousands of genes (in a single experiment). In the present study, the conditions for bacterial transcriptome analysis from cell harvest to DNA microarray analysis have been optimized. Taking into account the time, costs and accuracy of the experiments, this technology platform proves to be very useful and universally applicabale for studying bacterial transcriptomes. Here, we perform DNA microarray analysis with Streptococcus pneumoniae
as a case-study by comparing the transcriptional responses of S. pneumoniae
grown in the presence of varying L-serine concentrations in the medium. Total RNA was isolated by using a Macaloid method using an RNA isolation kit and the quality of RNA was checked by using an RNA quality check kit. cDNA was prepared using reverse transcriptase and the cDNA samples were labelled using one of two amine-reactive fluorescent dyes. Homemade DNA microarray slides were used for hybridization of the labelled cDNA samples and microarray data were analyzed by using a cDNA microarray data pre-processing framework (Microprep
). Finally, Cyber-T was used to analyze the data generated using Microprep
for the identification of statistically significant differentially expressed genes. Furthermore, in-house built software packages (PePPER, FIVA, DISCLOSE, PROSECUTOR, Genome2D) were used to analyze data.
Molecular Biology, Issue 98, DNA microarrays, gene expression, transcriptomics, bioinformatics, data analysis, pneumococcus, L-serine
Microarray-based Identification of Individual HERV Loci Expression: Application to Biomarker Discovery in Prostate Cancer
Institutions: Joint Unit Hospices de Lyon-bioMérieux, BioMérieux, Hospices Civils de Lyon, Lyon 1 University, BioMérieux, Hospices Civils de Lyon, Hospices Civils de Lyon.
The prostate-specific antigen (PSA) is the main diagnostic biomarker for prostate cancer in clinical use, but it lacks specificity and sensitivity, particularly in low dosage values1
. ‘How to use PSA' remains a current issue, either for diagnosis as a gray zone corresponding to a concentration in serum of 2.5-10 ng/ml which does not allow a clear differentiation to be made between cancer and noncancer2
or for patient follow-up as analysis of post-operative PSA kinetic parameters can pose considerable challenges for their practical application3,4
. Alternatively, noncoding RNAs (ncRNAs) are emerging as key molecules in human cancer, with the potential to serve as novel markers of disease, e.g.
PCA3 in prostate cancer5,6
and to reveal uncharacterized aspects of tumor biology. Moreover, data from the ENCODE project published in 2012 showed that different RNA types cover about 62% of the genome. It also appears that the amount of transcriptional regulatory motifs is at least 4.5x higher than the one corresponding to protein-coding exons. Thus, long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) constitute a wide range of putative/candidate transcriptional regulatory sequences, as it is their primary function in infectious retroviruses. HERVs, which are spread throughout the human genome, originate from ancestral and independent infections within the germ line, followed by copy-paste propagation processes and leading to multicopy families occupying 8% of the human genome (note that exons span 2% of our genome). Some HERV loci still express proteins that have been associated with several pathologies including cancer7-10
. We have designed a high-density microarray, in Affymetrix format, aiming to optimally characterize individual HERV loci expression, in order to better understand whether they can be active, if they drive ncRNA transcription or modulate coding gene expression. This tool has been applied in the prostate cancer field (Figure 1
Medicine, Issue 81, Cancer Biology, Genetics, Molecular Biology, Prostate, Retroviridae, Biomarkers, Pharmacological, Tumor Markers, Biological, Prostatectomy, Microarray Analysis, Gene Expression, Diagnosis, Human Endogenous Retroviruses, HERV, microarray, Transcriptome, prostate cancer, Affymetrix
Simultaneous Multicolor Imaging of Biological Structures with Fluorescence Photoactivation Localization Microscopy
Institutions: University of Maine.
Localization-based super resolution microscopy can be applied to obtain a spatial map (image) of the distribution of individual fluorescently labeled single molecules within a sample with a spatial resolution of tens of nanometers. Using either photoactivatable (PAFP) or photoswitchable (PSFP) fluorescent proteins fused to proteins of interest, or organic dyes conjugated to antibodies or other molecules of interest, fluorescence photoactivation localization microscopy (FPALM) can simultaneously image multiple species of molecules within single cells. By using the following approach, populations of large numbers (thousands to hundreds of thousands) of individual molecules are imaged in single cells and localized with a precision of ~10-30 nm. Data obtained can be applied to understanding the nanoscale spatial distributions of multiple protein types within a cell. One primary advantage of this technique is the dramatic increase in spatial resolution: while diffraction limits resolution to ~200-250 nm in conventional light microscopy, FPALM can image length scales more than an order of magnitude smaller. As many biological hypotheses concern the spatial relationships among different biomolecules, the improved resolution of FPALM can provide insight into questions of cellular organization which have previously been inaccessible to conventional fluorescence microscopy. In addition to detailing the methods for sample preparation and data acquisition, we here describe the optical setup for FPALM. One additional consideration for researchers wishing to do super-resolution microscopy is cost: in-house setups are significantly cheaper than most commercially available imaging machines. Limitations of this technique include the need for optimizing the labeling of molecules of interest within cell samples, and the need for post-processing software to visualize results. We here describe the use of PAFP and PSFP expression to image two protein species in fixed cells. Extension of the technique to living cells is also described.
Basic Protocol, Issue 82, Microscopy, Super-resolution imaging, Multicolor, single molecule, FPALM, Localization microscopy, fluorescent proteins
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal, Universite de Montreal, Universite de Montreal.
There are several lines of evidence supporting the role of de novo
mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo
mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo
mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness1
and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo
mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them2
. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, the effects of paternal age: there is a significantly increased risk of the disease with increasing paternal age, which could result from the age related increase in paternal de novo
mutations. This is the case for autism and schizophrenia3
. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo
mutations would more frequently come from males, particularly older males4
. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complexes diseases genes, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification for de novo
mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo
mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing
DNA Microarrays: Sample Quality Control, Array Hybridization and Scanning
Institutions: University of California, Davis.
Microarray expression profiling of the nervous system provides a powerful approach to identifying gene activities in different stages of development, different physiological or pathological states, response to therapy, and, in general, any condition that is being experimentally tested1
. Expression profiling of neural tissues requires isolation of high quality RNA, amplification of the isolated RNA and hybridization to DNA microarrays. In this article we describe protocols for reproducible microarray experiments from brain tumor tissue2
. We will start by performing a quality control analysis of isolated RNA samples with Agilent's 2100 Bioanalyzer "lab-on-a-chip" technology. High quality RNA samples are critical for the success of any microarray experiment, and the 2100 Bioanalyzer provides a quick, quantitative measurement of the sample quality. RNA samples are then amplified and labeled by performing reverse transcription to obtain cDNA, followed by in vitro transcription in the presence of labeled nucleotides to produce labeled cRNA. By using a dual-color labeling kit, we will label our experimental sample with Cy3 and a reference sample with Cy5. Both samples will then be combined and hybridized to Agilent's 4x44 K arrays. Dual-color arrays offer the advantage of a direct comparison between two RNA samples, thereby increasing the accuracy of the measurements, in particular for small changes in expression levels, because the two RNA samples are hybridized competitively to a single microarray. The arrays will be scanned at the two corresponding wavelengths, and the ratio of Cy3 to Cy5 signal for each feature will be used as a direct measurement of the relative abundance of the corresponding mRNA. This analysis identifies genes that are differentially expressed in response to the experimental conditions being tested.
Genetics, Issue 49, microarray, RNA, expression profiling, dual-color labeling
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Determining Genetic Expression Profiles in C. elegans Using Microarray and Real-time PCR
Institutions: Southwestern Oklahoma State University.
Synapses are composed of a presynaptic active zone in the signaling cell and a postsynaptic terminal in the target cell. In the case of chemical synapses, messages are carried by neurotransmitters released from presynaptic terminals and received by receptors on postsynaptic cells. Our previous research in Caenorhabditis elegans
has shown that VSM-1 negatively regulates exocytosis. Additionally, analysis of synapses in vsm-1
mutants showed that animals lacking a fully functional VSM-1 have increased synaptic connectivity. Based on these preliminary findings, we hypothesized that C. elegans
VSM-1 may play a crucial role in synaptogenesis. To test this hypothesis, double-labeled microarray analysis was performed, and gene expression profiles were determined. First, total RNA was isolated, reversely transcribed to cDNA, and hybridized to the DNA microarrays. Then, in-silico analysis of fluorescent probe hybridization revealed significant induction of many genes coding for members of the major sperm protein family (MSP) in mutants with enhanced synaptogenesis. MSPs are the major component of sperm in C. elegans
and appear to signal nematode oocyte maturation and ovulation . In fruit flies, Chai and colleagues 1
demonstrated that MSP-like molecules regulate presynaptic bouton number and size at the neuromuscular junction. Moreover, analysis performed by Tsuda and coworkers 2
suggested that MSPs may act as ligands for Eph receptors and trigger receptor tyrosine kinase signaling cascades. Lastly, real time PCR analysis corroborated that the gene coding for MSP-32 is induced in vsm-1(ok1468)
mutants. Taken together, research performed by our laboratory has shown that vsm-1
mutants have a significant increase in synaptic density, which could be mediated by MSP-32 signaling.
Molecular Biology, Issue 53, microarray, C. elegans, real-time PCR, neuroscience
Performing Custom MicroRNA Microarray Experiments
Institutions: University of Minnesota , University of Minnesota .
microRNAs (miRNAs) are a large family of ˜ 22 nucleotides (nt) long RNA molecules that are widely expressed in eukaryotes 1
. Complex genomes encode at least hundreds of miRNAs, which primarily inhibit the expression of a vast number of target genes post-transcriptionally 2, 3
. miRNAs control a broad range of biological processes 1
. In addition, altered miRNA expression has been associated with human diseases such as cancers, and miRNAs may serve as biomarkers for diseases and prognosis 4, 5
. It is important, therefore, to understand the expression and functions of miRNAs under many different conditions.
Three major approaches have been employed to profile miRNA expression: real-time PCR, microarray, and deep sequencing. The technique of miRNA microarray has the advantage of being high-throughput, generally less expensive, and most of the experimental and analysis steps can be carried out in a molecular biology laboratory at most universities, medical schools and associated hospitals. Here, we describe a method for performing custom miRNA microarray experiments. A miRNA probe set will be printed on glass slides to produce miRNA microarrays. RNA is isolated using a method or reagent that preserves small RNA species, and then labeled with a fluorescence dye. As a control, reference DNA oligonucleotides corresponding to a subset of miRNAs are also labeled with a different fluorescence dye. The reference DNA will serve to demonstrate the quality of the slide and hybridization and will also be used for data normalization. The RNA and DNA are mixed and hybridized to a microarray slide containing probes for most of the miRNAs in the database. After washing, the slide is scanned to obtain images, and intensities of the individual spots quantified. These raw signals will be further processed and analyzed as the expression data of the corresponding miRNAs. Microarray slides can be stripped and regenerated to reduce the cost of microarrays and to enhance the consistency of microarray experiments. The same principles and procedures are applicable to other types of custom microarray experiments.
Molecular Biology, Issue 56, Genetics, microRNA, custom microarray, oligonucleotide probes, RNA labeling
Chemically-blocked Antibody Microarray for Multiplexed High-throughput Profiling of Specific Protein Glycosylation in Complex Samples
Institutions: Institute for Hepatitis and Virus Research, Thomas Jefferson University , Drexel University College of Medicine, Van Andel Research Institute, Serome Biosciences Inc..
In this study, we describe an effective protocol for use in a multiplexed high-throughput antibody microarray with glycan binding protein detection that allows for the glycosylation profiling of specific proteins. Glycosylation of proteins is the most prevalent post-translational modification found on proteins, and leads diversified modifications of the physical, chemical, and biological properties of proteins. Because the glycosylation machinery is particularly susceptible to disease progression and malignant transformation, aberrant glycosylation has been recognized as early detection biomarkers for cancer and other diseases. However, current methods to study protein glycosylation typically are too complicated or expensive for use in most normal laboratory or clinical settings and a more practical method to study protein glycosylation is needed. The new protocol described in this study makes use of a chemically blocked antibody microarray with glycan-binding protein (GBP) detection and significantly reduces the time, cost, and lab equipment requirements needed to study protein glycosylation. In this method, multiple immobilized glycoprotein-specific antibodies are printed directly onto the microarray slides and the N-glycans on the antibodies are blocked. The blocked, immobilized glycoprotein-specific antibodies are able to capture and isolate glycoproteins from a complex sample that is applied directly onto the microarray slides. Glycan detection then can be performed by the application of biotinylated lectins and other GBPs to the microarray slide, while binding levels can be determined using Dylight 549-Streptavidin. Through the use of an antibody panel and probing with multiple biotinylated lectins, this method allows for an effective glycosylation profile of the different proteins found in a given human or animal sample to be developed.
Glycosylation of protein, which is the most ubiquitous post-translational modification on proteins, modifies the physical, chemical, and biological properties of a protein, and plays a fundamental role in various biological processes1-6
. Because the glycosylation machinery is particularly susceptible to disease progression and malignant transformation, aberrant glycosylation has been recognized as early detection biomarkers for cancer and other diseases 7-12
. In fact, most current cancer biomarkers, such as the L3 fraction of α-1 fetoprotein (AFP) for hepatocellular carcinoma 13-15
, and CA199 for pancreatic cancer 16, 17
are all aberrant glycan moieties on glycoproteins. However, methods to study protein glycosylation have been complicated, and not suitable for routine laboratory and clinical settings. Chen et al.
has recently invented a chemically blocked antibody microarray with a glycan-binding protein (GBP) detection method for high-throughput and multiplexed profile glycosylation of native glycoproteins in a complex sample 18
. In this affinity based microarray method, multiple immobilized glycoprotein-specific antibodies capture and isolate glycoproteins from the complex mixture directly on the microarray slide, and the glycans on each individual captured protein are measured by GBPs. Because all normal antibodies contain N-glycans which could be recognized by most GBPs, the critical step of this method is to chemically block the glycans on the antibodies from binding to GBP. In the procedure, the cis
-diol groups of the glycans on the antibodies were first oxidized to aldehyde groups by using NaIO4
in sodium acetate buffer avoiding light. The aldehyde groups were then conjugated to the hydrazide group of a cross-linker, 4-(4-N-MaleimidoPhenyl)butyric acid Hydrazide HCl (MPBH), followed by the conjugation of a dipeptide, Cys-Gly, to the maleimide group of the MPBH. Thus, the cis-diol groups on glycans of antibodies were converted into bulky none hydroxyl groups, which hindered the lectins and other GBPs bindings to the capture antibodies. This blocking procedure makes the GBPs and lectins bind only to the glycans of captured proteins. After this chemically blocking, serum samples were incubated with the antibody microarray, followed by the glycans detection by using different biotinylated lectins and GBPs, and visualized with Cy3-streptavidin. The parallel use of an antibody panel and multiple lectin probing provides discrete glycosylation profiles of multiple proteins in a given sample 18-20
. This method has been used successfully in multiple different labs 1, 7, 13, 19-31
. However, stability of MPBH and Cys-Gly, complicated and extended procedure in this method affect the reproducibility, effectiveness and efficiency of the method. In this new protocol, we replaced both MPBH and Cys-Gly with one much more stable reagent glutamic acid hydrazide (Glu-hydrazide), which significantly improved the reproducibility of the method, simplified and shorten the whole procedure so that the it can be completed within one working day. In this new protocol, we describe the detailed procedure of the protocol which can be readily adopted by normal labs for routine protein glycosylation study and techniques which are necessary to obtain reproducible and repeatable results.
Molecular Biology, Issue 63, Glycoproteins, glycan-binding protein, specific protein glycosylation, multiplexed high-throughput glycan blocked antibody microarray
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g.
drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2
. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4
in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Metabolic Labeling of Newly Transcribed RNA for High Resolution Gene Expression Profiling of RNA Synthesis, Processing and Decay in Cell Culture
Institutions: Max von Pettenkofer Institute, University of Cambridge, Ludwig-Maximilians-University Munich.
The development of whole-transcriptome microarrays and next-generation sequencing has revolutionized our understanding of the complexity of cellular gene expression. Along with a better understanding of the involved molecular mechanisms, precise measurements of the underlying kinetics have become increasingly important. Here, these powerful methodologies face major limitations due to intrinsic properties of the template samples they study, i.e.
total cellular RNA. In many cases changes in total cellular RNA occur either too slowly or too quickly to represent the underlying molecular events and their kinetics with sufficient resolution. In addition, the contribution of alterations in RNA synthesis, processing, and decay are not readily differentiated.
We recently developed high-resolution gene expression profiling to overcome these limitations. Our approach is based on metabolic labeling of newly transcribed RNA with 4-thiouridine (thus also referred to as 4sU-tagging) followed by rigorous purification of newly transcribed RNA using thiol-specific biotinylation and streptavidin-coated magnetic beads. It is applicable to a broad range of organisms including vertebrates, Drosophila
, and yeast. We successfully applied 4sU-tagging to study real-time kinetics of transcription factor activities, provide precise measurements of RNA half-lives, and obtain novel insights into the kinetics of RNA processing. Finally, computational modeling can be employed to generate an integrated, comprehensive analysis of the underlying molecular mechanisms.
Genetics, Issue 78, Cellular Biology, Molecular Biology, Microbiology, Biochemistry, Eukaryota, Investigative Techniques, Biological Phenomena, Gene expression profiling, RNA synthesis, RNA processing, RNA decay, 4-thiouridine, 4sU-tagging, microarray analysis, RNA-seq, RNA, DNA, PCR, sequencing
Detection of Architectural Distortion in Prior Mammograms via Analysis of Oriented Patterns
Institutions: University of Calgary , University of Calgary .
We demonstrate methods for the detection of architectural distortion in prior mammograms of interval-cancer cases based on analysis of the orientation of breast tissue patterns in mammograms. We hypothesize that architectural distortion modifies the normal orientation of breast tissue patterns in mammographic images before the formation of masses or tumors. In the initial steps of our methods, the oriented structures in a given mammogram are analyzed using Gabor filters and phase portraits to detect node-like sites of radiating or intersecting tissue patterns. Each detected site is then characterized using the node value, fractal dimension, and a measure of angular dispersion specifically designed to represent spiculating patterns associated with architectural distortion.
Our methods were tested with a database of 106 prior mammograms of 56 interval-cancer cases and 52 mammograms of 13 normal cases using the features developed for the characterization of architectural distortion, pattern classification via
quadratic discriminant analysis, and validation with the leave-one-patient out procedure. According to the results of free-response receiver operating characteristic analysis, our methods have demonstrated the capability to detect architectural distortion in prior mammograms, taken 15 months (on the average) before clinical diagnosis of breast cancer, with a sensitivity of 80% at about five false positives per patient.
Medicine, Issue 78, Anatomy, Physiology, Cancer Biology, angular spread, architectural distortion, breast cancer, Computer-Assisted Diagnosis, computer-aided diagnosis (CAD), entropy, fractional Brownian motion, fractal dimension, Gabor filters, Image Processing, Medical Informatics, node map, oriented texture, Pattern Recognition, phase portraits, prior mammograms, spectral analysis
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Phage Phenomics: Physiological Approaches to Characterize Novel Viral Proteins
Institutions: San Diego State University, San Diego State University, San Diego State University, San Diego State University, San Diego State University, Argonne National Laboratory, Broad Institute.
Current investigations into phage-host interactions are dependent on extrapolating knowledge from (meta)genomes. Interestingly, 60 - 95% of all phage sequences share no homology to current annotated proteins. As a result, a large proportion of phage genes are annotated as hypothetical. This reality heavily affects the annotation of both structural and auxiliary metabolic genes. Here we present phenomic methods designed to capture the physiological response(s) of a selected host during expression of one of these unknown phage genes. Multi-phenotype Assay Plates (MAPs) are used to monitor the diversity of host substrate utilization and subsequent biomass formation, while metabolomics provides bi-product analysis by monitoring metabolite abundance and diversity. Both tools are used simultaneously to provide a phenotypic profile associated with expression of a single putative phage open reading frame (ORF). Representative results for both methods are compared, highlighting the phenotypic profile differences of a host carrying either putative structural or metabolic phage genes. In addition, the visualization techniques and high throughput computational pipelines that facilitated experimental analysis are presented.
Immunology, Issue 100, phenomics, phage, viral metagenome, Multi-phenotype Assay Plates (MAPs), continuous culture, metabolomics