As DNA sequencing technology has markedly advanced in recent years2, it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought3. In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease4,5. Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis suggesting that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants6-8. However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, and is ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis required remains prohibitive for many investigators.
To address this need, we have developed a pooled sequencing approach1,9 and a novel software package1 for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings to traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals. Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (https://www.ibridgenetwork.org/wustl/splinter). We show a comparison between pooled sequencing of 947 individuals, all of whom also underwent genome-wide array, at over 20kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample were excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA.
22 Related JoVE Articles!
Collection and Extraction of Saliva DNA for Next Generation Sequencing
Institutions: The Research Institute at Nationwide Children's Hospital, The Ohio State University, The Ohio State University.
The preferred source of DNA in human genetics research is blood, or cell lines derived from blood, as these sources yield large quantities of high quality DNA. However, DNA extraction from saliva can yield high quality DNA with little to no degradation/fragmentation that is suitable for a variety of DNA assays without the expense of a phlebotomist and can even be acquired through the mail. However, at present, no saliva DNA collection/extraction protocols for next generation sequencing have been presented in the literature. This protocol optimizes parameters of saliva collection/storage and DNA extraction to be of sufficient quality and quantity for DNA assays with the highest standards, including microarray genotyping and next generation sequencing.
Medicine, Issue 90,
DNA collection, saliva, DNA extraction, Next generation sequencing, DNA purification, DNA
Orthogonal Protein Purification Facilitated by a Small Bispecific Affinity Tag
Institutions: Royal Institute of Technology.
Due to the high costs associated with purification of recombinant proteins the protocols need to be rationalized. For high-throughput efforts there is a demand for general methods that do not require target protein specific optimization1
. To achieve this, purification tags that genetically can be fused to the gene of interest are commonly used2
. The most widely used affinity handle is the hexa-histidine tag, which is suitable for purification under both native and denaturing conditions3
. The metabolic burden for producing the tag is low, but it does not provide as high specificity as competing affinity chromatography based strategies1,2
Here, a bispecific purification tag with two different binding sites on a 46 amino acid, small protein domain has been developed. The albumin-binding domain is derived from Streptococcal protein G and has a strong inherent affinity to human serum albumin (HSA). Eleven surface-exposed amino acids, not involved in albumin-binding4
, were genetically randomized to produce a combinatorial library. The protein library with the novel randomly arranged binding surface (Figure 1) was expressed on phage particles to facilitate selection of binders by phage display technology. Through several rounds of biopanning against a dimeric Z-domain derived from Staphylococcal protein A5
, a small, bispecific molecule with affinity for both HSA and the novel target was identified6
The novel protein domain, referred to as ABDz1, was evaluated as a purification tag for a selection of target proteins with different molecular weight, solubility and isoelectric point. Three target proteins were expressed in Escherishia coli
with the novel tag fused to their N-termini and thereafter affinity purified. Initial purification on either a column with immobilized HSA or Z-domain resulted in relatively pure products. Two-step affinity purification with the bispecific tag resulted in substantial improvement of protein purity. Chromatographic media with the Z-domain immobilized, for example MabSelect SuRe, are readily available for purification of antibodies and HSA can easily be chemically coupled to media to provide the second matrix.
This method is especially advantageous when there is a high demand on purity of the recovered target protein. The bifunctionality of the tag allows two different chromatographic steps to be used while the metabolic burden on the expression host is limited due to the small size of the tag. It provides a competitive alternative to so called combinatorial tagging where multiple tags are used in combination1,7
Molecular Biology, Issue 59, Affinity chromatography, albumin-binding domain, human serum albumin, Z-domain
Genotyping of Plant and Animal Samples without Prior DNA Purification
Institutions: Thermo Fisher Scientific.
The Direct PCR approach facilitates PCR amplification directly from small amounts of unpurified samples, and is demonstrated here for several plant and animal tissues (Figure 1
). Direct PCR is based on specially engineered Thermo Scientific Phusion and Phire DNA Polymerases, which include a double-stranded DNA binding domain that gives them unique properties such as high tolerance of inhibitors.
PCR-based target DNA detection has numerous applications in plant research, including plant genotype analysis and verification of transgenes. PCR from plant tissues traditionally involves an initial DNA isolation step, which may require expensive or toxic reagents. The process is time consuming and increases the risk of cross contamination1, 2
. Conversely, by using Thermo Scientific Phire Plant Direct PCR Kit the target DNA can be easily detected, without prior DNA extraction. In the model demonstrated here, an example of derived cleaved amplified polymorphic sequence analysis (dCAPS)3,4
is performed directly from Arabidopsis
plant leaves. dCAPS genotyping assays can be used to identify single nucleotide polymorphisms (SNPs) by SNP allele-specific restriction endonuclease digestion3
Some plant samples tend to be more challenging when using Direct PCR methods as they contain components that interfere with PCR, such as phenolic compounds. In these cases, an additional step to remove the compounds is traditionally required2,5
. Here, this problem is overcome by using a quick and easy dilution protocol followed by Direct PCR amplification (Figure 1
). Fifteen year-old oak leaves are used as a model for challenging plants as the specimen contains high amounts of phenolic compounds including tannins.
Gene transfer into mice is broadly used to study the roles of genes in development, physiology and human disease. The use of these animals requires screening for the presence of the transgene, usually with PCR. Traditionally, this involves a time consuming DNA isolation step, during which DNA for PCR analysis is purified from ear, tail or toe tissues6,7
. However, with the Thermo Scientific Phire Animal Tissue Direct PCR Kit transgenic mice can be genotyped without prior DNA purification. In this protocol transgenic mouse genotyping is achieved directly from mouse ear tissues, as demonstrated here for a challenging example where only one primer set is used for amplification of two fragments differing greatly in size.
Genetics, Issue 67, Molecular Biology, Plant Biology, Medicine, Direct PCR, DNA amplification, DNA purification, dCAPS, PCR-based target DNA detection, genotyping, Arabidopsis, oak, mouse tissues
Isolation and Genome Analysis of Single Virions using 'Single Virus Genomics'
Institutions: The J. Craig Venter Institute.
Whole genome amplification and sequencing of single microbial cells enables genomic characterization without the need of cultivation 1-3
. Viruses, which are ubiquitous and the most numerous entities on our planet 4
and important in all environments 5
, have yet to be revealed via similar approaches. Here we describe an approach for isolating and characterizing the genomes of single virions called 'Single Virus Genomics' (SVG). SVG utilizes flow cytometry to isolate individual viruses and whole genome amplification to obtain high molecular weight genomic DNA (gDNA) that can be used in subsequent sequencing reactions.
Genetics, Issue 75, Microbiology, Immunology, Virology, Molecular Biology, Environmental Sciences, Genomics, environmental genomics, Single virus, single virus genomics, SVG, whole genome amplification, flow cytometry, viral ecology, virion, genome analysis, DNA, PCR, sequencing
Separation of Single-stranded DNA, Double-stranded DNA and RNA from an Environmental Viral Community Using Hydroxyapatite Chromatography
Institutions: The J. Craig Venter Institute, The J. Craig Venter Institute.
Viruses, particularly bacteriophages (phages), are the most numerous biological entities on Earth1,2
. Viruses modulate host cell abundance and diversity, contribute to the cycling of nutrients, alter host cell phenotype, and influence the evolution of both host cell and viral communities through the lateral transfer of genes 3
. Numerous studies have highlighted the staggering genetic diversity of viruses and their functional potential in a variety of natural environments.
Metagenomic techniques have been used to study the taxonomic diversity and functional potential of complex viral assemblages whose members contain single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and RNA genotypes 4-9
. Current library construction protocols used to study environmental DNA-containing or RNA-containing viruses require an initial nuclease treatment in order to remove nontargeted templates 10
. However, a comprehensive understanding of the collective gene complement of the virus community and virus diversity requires knowledge of all members regardless of genome composition. Fractionation of purified nucleic acid subtypes provides an effective mechanism by which to study viral assemblages without sacrificing a subset of the community’s genetic signature.
Hydroxyapatite, a crystalline form of calcium phosphate, has been employed in the separation of nucleic acids, as well as proteins and microbes, since the 1960s11
. By exploiting the charge interaction between the positively-charged Ca2+
ions of the hydroxyapatite and the negatively charged phosphate backbone of the nucleic acid subtypes, it is possible to preferentially elute each nucleic acid subtype independent of the others. We recently employed this strategy to independently fractionate the genomes of ssDNA, dsDNA and RNA-containing viruses in preparation of DNA sequencing 12
. Here, we present a method for the fractionation and recovery of ssDNA, dsDNA and RNA viral nucleic acids from mixed viral assemblages using hydroxyapatite chromotography.
Immunology, Issue 55, Hydroxyapatite, single-stranded DNA, double-stranded DNA, RNA, DNA, chromatography, viral ecology, virus, bacteriophage
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus
, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus.
Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus
due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus Asian tiger mosquito, photoperiodic diapause, RNA-Seq de novo transcriptome assembly, mosquito husbandry
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro
. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro
and in vivo
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
Detecting Somatic Genetic Alterations in Tumor Specimens by Exon Capture and Massively Parallel Sequencing
Institutions: Memorial Sloan-Kettering Cancer Center, Memorial Sloan-Kettering Cancer Center.
Efforts to detect and investigate key oncogenic mutations have proven valuable to facilitate the appropriate treatment for cancer patients. The establishment of high-throughput, massively parallel "next-generation" sequencing has aided the discovery of many such mutations. To enhance the clinical and translational utility of this technology, platforms must be high-throughput, cost-effective, and compatible with formalin-fixed paraffin embedded (FFPE) tissue samples that may yield small amounts of degraded or damaged DNA. Here, we describe the preparation of barcoded and multiplexed DNA libraries followed by hybridization-based capture of targeted exons for the detection of cancer-associated mutations in fresh frozen and FFPE tumors by massively parallel sequencing. This method enables the identification of sequence mutations, copy number alterations, and select structural rearrangements involving all targeted genes. Targeted exon sequencing offers the benefits of high throughput, low cost, and deep sequence coverage, thus conferring high sensitivity for detecting low frequency mutations.
Molecular Biology, Issue 80, Molecular Diagnostic Techniques, High-Throughput Nucleotide Sequencing, Genetics, Neoplasms, Diagnosis, Massively parallel sequencing, targeted exon sequencing, hybridization capture, cancer, FFPE, DNA mutations
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico
RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
Generation of High Quality Chromatin Immunoprecipitation DNA Template for High-throughput Sequencing (ChIP-seq)
Institutions: Children's Hospital of Philadelphia Research Institute, University of Pennsylvania .
ChIP-sequencing (ChIP-seq) methods directly offer whole-genome coverage, where combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing can be utilized to identify the repertoire of mammalian DNA sequences bound by transcription factors in vivo
. "Next-generation" genome sequencing technologies provide 1-2 orders of magnitude increase in the amount of sequence that can be cost-effectively generated over older technologies thus allowing for ChIP-seq methods to directly provide whole-genome coverage for effective profiling of mammalian protein-DNA interactions.
For successful ChIP-seq approaches, one must generate high quality ChIP DNA template to obtain the best sequencing outcomes. The description is based around experience with the protein product of the gene most strongly implicated in the pathogenesis of type 2 diabetes, namely the transcription factor transcription factor 7-like 2 (TCF7L2). This factor has also been implicated in various cancers.
Outlined is how to generate high quality ChIP DNA template derived from the colorectal carcinoma cell line, HCT116, in order to build a high-resolution map through sequencing to determine the genes bound by TCF7L2, giving further insight in to its key role in the pathogenesis of complex traits.
Molecular Biology, Issue 74, Genetics, Biochemistry, Microbiology, Medicine, Proteins, DNA-Binding Proteins, Transcription Factors, Chromatin Immunoprecipitation, Genes, chromatin, immunoprecipitation, ChIP, DNA, PCR, sequencing, antibody, cross-link, cell culture, assay
Automated High-throughput Behavioral Analyses in Zebrafish Larvae
Institutions: Brown University .
We have created a novel high-throughput imaging system for the analysis of behavior in 7-day-old zebrafish larvae in multi-lane plates. This system measures spontaneous behaviors and the response to an aversive stimulus, which is shown to the larvae via
a PowerPoint presentation. The recorded images are analyzed with an ImageJ macro, which automatically splits the color channels, subtracts the background, and applies a threshold to identify individual larvae placement in the lanes. We can then import the coordinates into an Excel sheet to quantify swim speed, preference for edge or side of the lane, resting behavior, thigmotaxis, distance between larvae, and avoidance behavior. Subtle changes in behavior are easily detected using our system, making it useful for behavioral analyses after exposure to environmental toxicants or pharmaceuticals.
Behavior, Issue 77, Neuroscience, Neurobiology, Developmental Biology, Cellular Biology, Molecular Biology, Biochemistry, Physiology, Anatomy, Toxicology, Behavioral Sciences, Zebrafish larvae, high-throughput assay, thigmotaxis, avoidance, behavior, automated analysis, Zebrafish, Danio rerio, animal model
Electricity-Free, Sequential Nucleic Acid and Protein Isolation
Institutions: CUBRC, Inc., State University of New York at Buffalo, School of Medicine and Biomedical Sciences.
Traditional and emerging pathogens such as Enterohemorrhagic Escherichia coli
(EHEC), Yersinia pestis,
or prion-based diseases are of significant concern for governments, industries and medical professionals worldwide. For example, EHECs, combined with Shigella
, are responsible for the deaths of approximately 325,000 children each year and are particularly prevalent in the developing world where laboratory-based identification, common in the United States, is unavailable 1
. The development and distribution of low cost, field-based, point-of-care tools to aid in the rapid identification and/or diagnosis of pathogens or disease markers could dramatically alter disease progression and patient prognosis. We have developed a tool to isolate nucleic acids and proteins from a sample by solid-phase extraction (SPE) without electricity or associated laboratory equipment 2
. The isolated macromolecules can be used for diagnosis either in a forward lab or using field-based point-of-care platforms. Importantly, this method provides for the direct comparison of nucleic acid and protein data from an un-split sample, offering a confidence through corroboration of genomic and proteomic analysis.
Our isolation tool utilizes the industry standard for solid-phase nucleic acid isolation, the BOOM technology, which isolates nucleic acids from a chaotropic salt solution, usually guanidine isothiocyanate, through binding to silica-based particles or filters 3
. CUBRC's proprietary solid-phase extraction chemistry is used to purify protein from chaotropic salt solutions, in this case, from the waste or flow-thru following nucleic acid isolation4
By packaging well-characterized chemistries into a small, inexpensive and simple platform, we have generated a portable system for nucleic acid and protein extraction that can be performed under a variety of conditions. The isolated nucleic acids are stable and can be transported to a position where power is available for PCR amplification while the protein content can immediately be analyzed by hand held or other immunological-based assays. The rapid identification of disease markers in the field could significantly alter the patient's outcome by directing the proper course of treatment at an earlier stage of disease progression. The tool and method described are suitable for use with virtually any infectious agent and offer the user the redundancy of multi-macromolecule type analyses while simultaneously reducing their logistical burden.
Chemistry, Issue 63, Solid phase extraction, nucleic acid, protein, isolation, silica, Guanidine thiocyanate, isopropanol, remote, DTRA
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g.
drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2
. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4
in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation
Institutions: Agency for Science, Technology and Research, Singapore, A*STAR-Duke-NUS Neuroscience Research Partnership, Singapore, National University of Singapore, Singapore.
Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces 7, 2, 12
. Such architectures are not random and involve interactions between gene promoters and regulatory elements 13
. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination 1, 14
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures 5,6
. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis is then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions 8
We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video.
Genetics, Issue 62, ChIP, ChIA-PET, Chromatin Interactions, Genomics, Next-Generation Sequencing
A Restriction Enzyme Based Cloning Method to Assess the In vitro Replication Capacity of HIV-1 Subtype C Gag-MJ4 Chimeric Viruses
Institutions: Emory University, Emory University.
The protective effect of many HLA class I alleles on HIV-1 pathogenesis and disease progression is, in part, attributed to their ability to target conserved portions of the HIV-1 genome that escape with difficulty. Sequence changes attributed to cellular immune pressure arise across the genome during infection, and if found within conserved regions of the genome such as Gag, can affect the ability of the virus to replicate in vitro
. Transmission of HLA-linked polymorphisms in Gag to HLA-mismatched recipients has been associated with reduced set point viral loads. We hypothesized this may be due to a reduced replication capacity of the virus. Here we present a novel method for assessing the in vitro
replication of HIV-1 as influenced by the gag
gene isolated from acute time points from subtype C infected Zambians. This method uses restriction enzyme based cloning to insert the gag
gene into a common subtype C HIV-1 proviral backbone, MJ4. This makes it more appropriate to the study of subtype C sequences than previous recombination based methods that have assessed the in vitro
replication of chronically derived gag-pro
sequences. Nevertheless, the protocol could be readily modified for studies of viruses from other subtypes. Moreover, this protocol details a robust and reproducible method for assessing the replication capacity of the Gag-MJ4 chimeric viruses on a CEM-based T cell line. This method was utilized for the study of Gag-MJ4 chimeric viruses derived from 149 subtype C acutely infected Zambians, and has allowed for the identification of residues in Gag that affect replication. More importantly, the implementation of this technique has facilitated a deeper understanding of how viral replication defines parameters of early HIV-1 pathogenesis such as set point viral load and longitudinal CD4+ T cell decline.
Infectious Diseases, Issue 90, HIV-1, Gag, viral replication, replication capacity, viral fitness, MJ4, CEM, GXR25
Polymerase Chain Reaction: Basic Protocol Plus Troubleshooting and Optimization Strategies
Institutions: University of California, Los Angeles .
In the biological sciences there have been technological advances that catapult the discipline into golden ages of discovery. For example, the field of microbiology was transformed with the advent of Anton van Leeuwenhoek's microscope, which allowed scientists to visualize prokaryotes for the first time. The development of the polymerase chain reaction (PCR) is one of those innovations that changed the course of molecular science with its impact spanning countless subdisciplines in biology. The theoretical process was outlined by Keppe and coworkers in 1971; however, it was another 14 years until the complete PCR procedure was described and experimentally applied by Kary Mullis while at Cetus Corporation in 1985. Automation and refinement of this technique progressed with the introduction of a thermal stable DNA polymerase from the bacterium Thermus aquaticus
, consequently the name Taq
PCR is a powerful amplification technique that can generate an ample supply of a specific segment of DNA (i.e., an amplicon) from only a small amount of starting material (i.e., DNA template or target sequence). While straightforward and generally trouble-free, there are pitfalls that complicate the reaction producing spurious results. When PCR fails it can lead to many non-specific DNA products of varying sizes that appear as a ladder or smear of bands on agarose gels. Sometimes no products form at all. Another potential problem occurs when mutations are unintentionally introduced in the amplicons, resulting in a heterogeneous population of PCR products. PCR failures can become frustrating unless patience and careful troubleshooting are employed to sort out and solve the problem(s). This protocol outlines the basic principles of PCR, provides a methodology that will result in amplification of most target sequences, and presents strategies for optimizing a reaction. By following this PCR guide, students should be able to:
● Set up reactions and thermal cycling conditions for a conventional PCR experiment
● Understand the function of various reaction components and their overall effect on a PCR experiment
● Design and optimize a PCR experiment for any DNA template
● Troubleshoot failed PCR experiments
Basic Protocols, Issue 63, PCR, optimization, primer design, melting temperature, Tm, troubleshooting, additives, enhancers, template DNA quantification, thermal cycler, molecular biology, genetics
An Allelotyping PCR for Identifying Salmonella enterica serovars Enteritidis, Hadar, Heidelberg, and Typhimurium
Institutions: University of Georgia.
Current commercial PCRs tests for identifying Salmonella
target genes unique to this genus. However, there are two species, six subspecies, and over 2,500 different Salmonella
serovars, and not all are equal in their significance to public health. For example, finding S. enterica subspecies
IIIa Arizona on a table egg layer farm is insignificant compared to the isolation of S. enterica
subspecies I serovar Enteritidis, the leading cause of salmonellosis linked to the consumption of table eggs. Serovars are identified based on antigenic differences in lipopolysaccharide (LPS)(O antigen) and flagellin (H1 and H2 antigens). These antigenic differences are the outward appearance of the diversity of genes and gene alleles associated with this phenotype.
We have developed an allelotyping, multiplex PCR that keys on genetic differences between four major S. enterica
subspecies I serovars found in poultry and associated with significant human disease in the US. The PCR primer pairs were targeted to key genes or sequences unique to a specific Salmonella
serovar and designed to produce an amplicon with size specific for that gene or allele. Salmonella
serovar is assigned to an isolate based on the combination of PCR test results for specific LPS and flagellin gene alleles. The multiplex PCRs described in this article are specific for the detection of S. enterica
subspecies I serovars Enteritidis, Hadar, Heidelberg, and Typhimurium.
Here we demonstrate how to use the multiplex PCRs to identify serovar for a Salmonella
Immunology, Issue 53, PCR, Salmonella, multiplex, Serovar
Identifying DNA Mutations in Purified Hematopoietic Stem/Progenitor Cells
Institutions: UT Health Science Center at San Antonio, UT Health Science Center at San Antonio, UT Health Science Center at San Antonio, UT Health Science Center at San Antonio, UT Health Science Center at San Antonio.
In recent years, it has become apparent that genomic instability is tightly related to many developmental disorders, cancers, and aging. Given that stem cells are responsible for ensuring tissue homeostasis and repair throughout life, it is reasonable to hypothesize that the stem cell population is critical for preserving genomic integrity of tissues. Therefore, significant interest has arisen in assessing the impact of endogenous and environmental factors on genomic integrity in stem cells and their progeny, aiming to understand the etiology of stem-cell based diseases.
transgenic mice carry a recoverable λ phage vector encoding the LacI
reporter system, in which the LacI
gene serves as the mutation reporter. The result of a mutated LacI
gene is the production of β-galactosidase that cleaves a chromogenic substrate, turning it blue. The LacI
reporter system is carried in all cells, including stem/progenitor cells and can easily be recovered and used to subsequently infect E. coli
. After incubating infected E. coli
on agarose that contains the correct substrate, plaques can be scored; blue plaques indicate a mutant LacI
gene, while clear plaques harbor wild-type. The frequency of blue (among clear) plaques indicates the mutant frequency in the original cell population the DNA was extracted from. Sequencing the mutant LacI
gene will show the location of the mutations in the gene and the type of mutation.
transgenic mouse model is well-established as an in vivo
mutagenesis assay. Moreover, the mice and the reagents for the assay are commercially available. Here we describe in detail how this model can be adapted to measure the frequency of spontaneously occurring DNA mutants in stem cell-enriched Lin-
(LSK) cells and other subpopulations of the hematopoietic system.
Infection, Issue 84, In vivo mutagenesis, hematopoietic stem/progenitor cells, LacI mouse model, DNA mutations, E. coli
Isolation of Fidelity Variants of RNA Viruses and Characterization of Virus Mutation Frequency
Institutions: Institut Pasteur .
RNA viruses use RNA dependent RNA polymerases to replicate their genomes. The intrinsically high error rate of these enzymes is a large contributor to the generation of extreme population diversity that facilitates virus adaptation and evolution. Increasing evidence shows that the intrinsic error rates, and the resulting mutation frequencies, of RNA viruses can be modulated by subtle amino acid changes to the viral polymerase. Although biochemical assays exist for some viral RNA polymerases that permit quantitative measure of incorporation fidelity, here we describe a simple method of measuring mutation frequencies of RNA viruses that has proven to be as accurate as biochemical approaches in identifying fidelity altering mutations. The approach uses conventional virological and sequencing techniques that can be performed in most biology laboratories. Based on our experience with a number of different viruses, we have identified the key steps that must be optimized to increase the likelihood of isolating fidelity variants and generating data of statistical significance. The isolation and characterization of fidelity altering mutations can provide new insights into polymerase structure and function1-3
. Furthermore, these fidelity variants can be useful tools in characterizing mechanisms of virus adaptation and evolution4-7
Immunology, Issue 52, Polymerase fidelity, RNA virus, mutation frequency, mutagen, RNA polymerase, viral evolution
Hydrogel Nanoparticle Harvesting of Plasma or Urine for Detecting Low Abundance Proteins
Institutions: George Mason University, Ceres Nanosciences.
Novel biomarker discovery plays a crucial role in providing more sensitive and specific disease detection. Unfortunately many low-abundance biomarkers that exist in biological fluids cannot be easily detected with mass spectrometry or immunoassays because they are present in very low concentration, are labile, and are often masked by high-abundance proteins such as albumin or immunoglobulin. Bait containing poly(N-isopropylacrylamide) (NIPAm) based nanoparticles are able to overcome these physiological barriers. In one step they are able to capture, concentrate and preserve biomarkers from body fluids. Low-molecular weight analytes enter the core of the nanoparticle and are captured by different organic chemical dyes, which act as high affinity protein baits. The nanoparticles are able to concentrate the proteins of interest by several orders of magnitude. This concentration factor is sufficient to increase the protein level such that the proteins are within the detection limit of current mass spectrometers, western blotting, and immunoassays. Nanoparticles can be incubated with a plethora of biological fluids and they are able to greatly enrich the concentration of low-molecular weight proteins and peptides while excluding albumin and other high-molecular weight proteins. Our data show that a 10,000 fold amplification in the concentration of a particular analyte can be achieved, enabling mass spectrometry and immunoassays to detect previously undetectable biomarkers.
Bioengineering, Issue 90, biomarker, hydrogel, low abundance, mass spectrometry, nanoparticle, plasma, protein, urine
An Affordable HIV-1 Drug Resistance Monitoring Method for Resource Limited Settings
Institutions: University of KwaZulu-Natal, Durban, South Africa, Jembi Health Systems, University of Amsterdam, Stanford Medical School.
HIV-1 drug resistance has the potential to seriously compromise the effectiveness and impact of antiretroviral therapy (ART). As ART programs in sub-Saharan Africa continue to expand, individuals on ART should be closely monitored for the emergence of drug resistance. Surveillance of transmitted drug resistance to track transmission of viral strains already resistant to ART is also critical. Unfortunately, drug resistance testing is still not readily accessible in resource limited settings, because genotyping is expensive and requires sophisticated laboratory and data management infrastructure. An open access genotypic drug resistance monitoring method to manage individuals and assess transmitted drug resistance is described. The method uses free open source software for the interpretation of drug resistance patterns and the generation of individual patient reports. The genotyping protocol has an amplification rate of greater than 95% for plasma samples with a viral load >1,000 HIV-1 RNA copies/ml. The sensitivity decreases significantly for viral loads <1,000 HIV-1 RNA copies/ml. The method described here was validated against a method of HIV-1 drug resistance testing approved by the United States Food and Drug Administration (FDA), the Viroseq genotyping method. Limitations of the method described here include the fact that it is not automated and that it also failed to amplify the circulating recombinant form CRF02_AG from a validation panel of samples, although it amplified subtypes A and B from the same panel.
Medicine, Issue 85, Biomedical Technology, HIV-1, HIV Infections, Viremia, Nucleic Acids, genetics, antiretroviral therapy, drug resistance, genotyping, affordable
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal, Universite de Montreal, Universite de Montreal.
There are several lines of evidence supporting the role of de novo
mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo
mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo
mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness1
and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo
mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them2
. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, the effects of paternal age: there is a significantly increased risk of the disease with increasing paternal age, which could result from the age related increase in paternal de novo
mutations. This is the case for autism and schizophrenia3
. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo
mutations would more frequently come from males, particularly older males4
. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complexes diseases genes, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification for de novo
mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo
mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing