As DNA sequencing technology has markedly advanced in recent years2, it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought3. In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease4,5. Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis suggesting that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants6-8. However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, and is ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis required remains prohibitive for many investigators.
To address this need, we have developed a pooled sequencing approach1,9 and a novel software package1 for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings to traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals. Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (http://www.ibridgenetwork.org/wustl/splinter). We show a comparison between pooled sequencing of 947 individuals, all of whom also underwent genome-wide array, at over 20kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample were excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA.
24 Related JoVE Articles!
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1
and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8
The ITS2 Database9
presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11
. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12
(direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13
. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14
search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16
for multiple sequence-structure alignment calculation and Neighbor Joining18
tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Pyrosequencing for Microbial Identification and Characterization
Institutions: Johns Hopkins University, Qiagen Sciences, Inc..
Pyrosequencing is a versatile technique that facilitates microbial genome sequencing that can be used to identify bacterial species, discriminate bacterial strains and detect genetic mutations that confer resistance to anti-microbial agents. The advantages of pyrosequencing for microbiology applications include rapid and reliable high-throughput screening and accurate identification of microbes and microbial genome mutations. Pyrosequencing involves sequencing of DNA by synthesizing the complementary strand a single base at a time, while determining the specific nucleotide being incorporated during the synthesis reaction. The reaction occurs on immobilized single stranded template DNA where the four deoxyribonucleotides (dNTP) are added sequentially and the unincorporated dNTPs are enzymatically degraded before addition of the next dNTP to the synthesis reaction. Detection of the specific base incorporated into the template is monitored by generation of chemiluminescent signals. The order of dNTPs that produce the chemiluminescent signals determines the DNA sequence of the template. The real-time sequencing capability of pyrosequencing technology enables rapid microbial identification in a single assay. In addition, the pyrosequencing instrument, can analyze the full genetic diversity of anti-microbial drug resistance, including typing of SNPs, point mutations, insertions, and deletions, as well as quantification of multiple gene copies that may occur in some anti-microbial resistance patterns.
Microbiology, Issue 78, Genetics, Molecular Biology, Basic Protocols, Genomics, Eukaryota, Bacteria, Viruses, Bacterial Infections and Mycoses, Virus Diseases, Diagnosis, Therapeutics, Equipment and Supplies, Technology, Industry, and Agriculture, Life Sciences (General), Pyrosequencing, DNA, Microbe, PCR, primers, Next-Generation, high-throughput, sequencing
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI
program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al.
2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
High-throughput Physical Mapping of Chromosomes using Automated in situ Hybridization
Institutions: Virginia Tech.
Projects to obtain whole-genome sequences for 10,000 vertebrate species1
and for 5,000 insect and related arthropod species2
are expected to take place over the next 5 years. For example, the sequencing of the genomes for 15 malaria mosquitospecies is currently being done using an Illumina platform3,4
. This Anopheles
species cluster includes both vectors and non-vectors of malaria. When the genome assemblies become available, researchers will have the unique opportunity to perform comparative analysis for inferring evolutionary changes relevant to vector ability. However, it has proven difficult to use next-generation sequencing reads to generate high-quality de novo
. Moreover, the existing genome assemblies for Anopheles gambiae
, although obtained using the Sanger method, are gapped or fragmented4,6
Success of comparative genomic analyses will be limited if researchers deal with numerous sequencing contigs, rather than with chromosome-based genome assemblies. Fragmented, unmapped sequences create problems for genomic analyses because: (i) unidentified gaps cause incorrect or incomplete annotation of genomic sequences; (ii) unmapped sequences lead to confusion between paralogous genes and genes from different haplotypes; and (iii) the lack of chromosome assignment and orientation of the sequencing contigs does not allow for reconstructing rearrangement phylogeny and studying chromosome evolution. Developing high-resolution physical maps for species with newly sequenced genomes is a timely and cost-effective investment that will facilitate genome annotation, evolutionary analysis, and re-sequencing of individual genomes from natural populations7,8
Here, we present innovative approaches to chromosome preparation, fluorescent in situ
hybridization (FISH), and imaging that facilitate rapid development of physical maps. Using An. gambiae
as an example, we demonstrate that the development of physical chromosome maps can potentially improve genome assemblies and, thus, the quality of genomic analyses. First, we use a high-pressure method to prepare polytene chromosome spreads. This method, originally developed for Drosophila9
, allows the user to visualize more details on chromosomes than the regular squashing technique10
. Second, a fully automated, front-end system for FISH is used for high-throughput physical genome mapping. The automated slide staining system runs multiple assays simultaneously and dramatically reduces hands-on time11
. Third, an automatic fluorescent imaging system, which includes a motorized slide stage, automatically scans and photographs labeled chromosomes after FISH12
. This system is especially useful for identifying and visualizing multiple chromosomal plates on the same slide. In addition, the scanning process captures a more uniform FISH result. Overall, the automated high-throughput physical mapping protocol is more efficient than a standard manual protocol.
Genetics, Issue 64, Entomology, Molecular Biology, Genomics, automation, chromosome, genome, hybridization, labeling, mapping, mosquito
Genome Editing with CompoZr Custom Zinc Finger Nucleases (ZFNs)
Institutions: Sigma Life Science.
Genome editing is a powerful technique that can be used to elucidate gene function and the genetic basis of disease. Traditional gene editing methods such as chemical-based mutagenesis or random integration of DNA sequences confer indiscriminate genetic changes in an overall inefficient manner and require incorporation of undesirable synthetic sequences or use of aberrant culture conditions, potentially confusing biological study. By contrast, transient ZFN expression in a cell can facilitate precise, heritable gene editing in a highly efficient manner without the need for administration of chemicals or integration of synthetic transgenes.
Zinc finger nucleases (ZFNs) are enzymes which bind and cut distinct sequences of double-stranded DNA (dsDNA). A functional CompoZr ZFN unit consists of two individual monomeric proteins that bind a DNA "half-site" of approximately 15-18 nucleotides (see Figure 1
). When two ZFN monomers "home" to their adjacent target sites the DNA-cleavage domains dimerize and create a double-strand break (DSB) in the DNA.1
Introduction of ZFN-mediated DSBs in the genome lays a foundation for highly efficient genome editing. Imperfect repair of DSBs in a cell via the non-homologous end-joining (NHEJ) DNA repair pathway can result in small insertions and deletions (indels). Creation of indels within the gene coding sequence of a cell can result in frameshift and subsequent functional knockout of a gene locus at high efficiency.2
While this protocol describes the use of ZFNs to create a gene knockout, integration of transgenes may also be conducted via homology-directed repair at the ZFN cut site.
The CompoZr Custom ZFN Service represents a systematic, comprehensive, and well-characterized approach to targeted gene editing for the scientific community with ZFN technology. Sigma scientists work closely with investigators to 1) perform due diligence analysis including analysis of relevant gene structure, biology, and model system pursuant to the project goals, 2) apply this knowledge to develop a sound targeting strategy, 3) then design, build, and functionally validate ZFNs for activity in a relevant cell line. The investigator receives positive control genomic DNA and primers, and ready-to-use ZFN reagents supplied in both plasmid DNA and in-vitro transcribed mRNA format. These reagents may then be delivered for transient expression in the investigator’s cell line or cell type of choice. Samples are then tested for gene editing at the locus of interest by standard molecular biology techniques including PCR amplification, enzymatic digest, and electrophoresis. After positive signal for gene editing is detected in the initial population, cells are single-cell cloned and genotyped for identification of mutant clones/alleles.
Genetics, Issue 64, Molecular Biology, Zinc Finger Nuclease, Genome Engineering, Genomic Editing, Gene Modification, Gene Knockout, Gene Integration, non-homologous end joining, homologous recombination, targeted genome editing
Procedures for Identifying Infectious Prions After Passage Through the Digestive System of an Avian Species
Infectious prion (PrPRes
) material is likely the cause of fatal, neurodegenerative transmissible spongiform encephalopathy (TSE) diseases1
. Transmission of TSE diseases, such as chronic wasting disease (CWD), is presumed to be from animal to animal2,3
as well as from environmental sources4-6
. Scavengers and carnivores have potential to translocate PrPRes
material through consumption and excretion of CWD-contaminated carrion. Recent work has documented passage of PrPRes
material through the digestive system of American crows (Corvus brachyrhynchos
), a common North American scavenger7
We describe procedures used to document passage of PrPRes
material through American crows. Crows were gavaged with RML-strain mouse-adapted scrapie and their feces were collected 4 hr post gavage. Crow feces were then pooled and injected intraperitoneally into C57BL/6 mice. Mice were monitored daily until they expressed clinical signs of mouse scrapie and were thereafter euthanized. Asymptomatic mice were monitored until 365 days post inoculation. Western blot analysis was conducted to confirm disease status. Results revealed that prions remain infectious after traveling through the digestive system of crows and are present in the feces, causing disease in test mice.
Infection, Issue 81, American crows, feces, mouse model, prion detection, PrPRes, scrapie, TSE transmission
Barnes Maze Testing Strategies with Small and Large Rodent Models
Institutions: University of Missouri, Food and Drug Administration.
Spatial learning and memory of laboratory rodents is often assessed via navigational ability in mazes, most popular of which are the water and dry-land (Barnes) mazes. Improved performance over sessions or trials is thought to reflect learning and memory of the escape cage/platform location. Considered less stressful than water mazes, the Barnes maze is a relatively simple design of a circular platform top with several holes equally spaced around the perimeter edge. All but one of the holes are false-bottomed or blind-ending, while one leads to an escape cage. Mildly aversive stimuli (e.g.
bright overhead lights) provide motivation to locate the escape cage. Latency to locate the escape cage can be measured during the session; however, additional endpoints typically require video recording. From those video recordings, use of automated tracking software can generate a variety of endpoints that are similar to those produced in water mazes (e.g.
distance traveled, velocity/speed, time spent in the correct quadrant, time spent moving/resting, and confirmation of latency). Type of search strategy (i.e.
random, serial, or direct) can be categorized as well. Barnes maze construction and testing methodologies can differ for small rodents, such as mice, and large rodents, such as rats. For example, while extra-maze cues are effective for rats, smaller wild rodents may require intra-maze cues with a visual barrier around the maze. Appropriate stimuli must be identified which motivate the rodent to locate the escape cage. Both Barnes and water mazes can be time consuming as 4-7 test trials are typically required to detect improved learning and memory performance (e.g.
shorter latencies or path lengths to locate the escape platform or cage) and/or differences between experimental groups. Even so, the Barnes maze is a widely employed behavioral assessment measuring spatial navigational abilities and their potential disruption by genetic, neurobehavioral manipulations, or drug/ toxicant exposure.
Behavior, Issue 84, spatial navigation, rats, Peromyscus, mice, intra- and extra-maze cues, learning, memory, latency, search strategy, escape motivation
Transabdominal Ultrasound for Pregnancy Diagnosis in Reeves' Muntjac Deer
Institutions: Colorado State University.
Reeves' muntjac deer (Muntiacus reevesi
) are a small cervid species native to southeast Asia, and are currently being investigated as a potential model of prion disease transmission and pathogenesis. Vertical transmission is an area of interest among researchers studying infectious diseases, including prion disease, and these investigations require efficient methods for evaluating the effects of maternal infection on reproductive performance. Ultrasonographic examination is a well-established tool for diagnosing pregnancy and assessing fetal health in many animal species1-7
, including several species of farmed cervids8-19
, however this technique has not been described in Reeves' muntjac deer. Here we describe the application of transabdominal ultrasound to detect pregnancy in muntjac does and to evaluate fetal growth and development throughout the gestational period. Using this procedure, pregnant animals were identified as early as 35 days following doe-buck pairing and this was an effective means to safely monitor the pregnancy at regular intervals. Future goals of this work will include establishing normal fetal measurement references for estimation of gestational age, determining sensitivity and specificity of the technique for diagnosing pregnancy at various stages of gestation, and identifying variations in fetal growth and development under different experimental conditions.
Medicine, Issue 83, Ultrasound, Reeves' muntjac deer, Muntiacus reevesi, fetal development, fetal growth, captive cervids
Preparation of Primary Myogenic Precursor Cell/Myoblast Cultures from Basal Vertebrate Lineages
Institutions: University of Alabama at Birmingham, INRA UR1067, INRA UR1037.
Due to the inherent difficulty and time involved with studying the myogenic program in vivo
, primary culture systems derived from the resident adult stem cells of skeletal muscle, the myogenic precursor cells (MPCs), have proven indispensible to our understanding of mammalian skeletal muscle development and growth. Particularly among the basal taxa of Vertebrata,
however, data are limited describing the molecular mechanisms controlling the self-renewal, proliferation, and differentiation of MPCs. Of particular interest are potential mechanisms that underlie the ability of basal vertebrates to undergo considerable postlarval skeletal myofiber hyperplasia (i.e.
teleost fish) and full regeneration following appendage loss (i.e.
urodele amphibians). Additionally, the use of cultured myoblasts could aid in the understanding of regeneration and the recapitulation of the myogenic program and the differences between them. To this end, we describe in detail a robust and efficient protocol (and variations therein) for isolating and maintaining MPCs and their progeny, myoblasts and immature myotubes, in cell culture as a platform for understanding the evolution of the myogenic program, beginning with the more basal vertebrates. Capitalizing on the model organism status of the zebrafish (Danio rerio
), we report on the application of this protocol to small fishes of the cyprinid clade Danioninae
. In tandem, this protocol can be utilized to realize a broader comparative approach by isolating MPCs from the Mexican axolotl (Ambystomamexicanum
) and even laboratory rodents. This protocol is now widely used in studying myogenesis in several fish species, including rainbow trout, salmon, and sea bream1-4
Basic Protocol, Issue 86, myogenesis, zebrafish, myoblast, cell culture, giant danio, moustached danio, myotubes, proliferation, differentiation, Danioninae, axolotl
DNA-affinity-purified Chip (DAP-chip) Method to Determine Gene Targets for Bacterial Two component Regulatory Systems
Institutions: Lawrence Berkeley National Laboratory.
methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro
microarray based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs and thus predicting the function for a sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris
Genetics, Issue 89, DNA-Affinity-Purified-chip, response regulator, transcription factor binding site, two component system, signal transduction, Desulfovibrio, lactate utilization regulator, ChIP-chip
Getting to Compliance in Forced Exercise in Rodents: A Critical Standard to Evaluate Exercise Impact in Aging-related Disorders and Disease
Institutions: Louisiana State University Health Sciences Center.
There is a major increase in the awareness of the positive impact of exercise on improving several disease states with neurobiological basis; these include improving cognitive function and physical performance. As a result, there is an increase in the number of animal studies employing exercise. It is argued that one intrinsic value of forced exercise is that the investigator has control over the factors that can influence the impact of exercise on behavioral outcomes, notably exercise frequency, duration, and intensity of the exercise regimen. However, compliance in forced exercise regimens may be an issue, particularly if potential confounds of employing foot-shock are to be avoided. It is also important to consider that since most cognitive and locomotor impairments strike in the aged individual, determining impact of exercise on these impairments should consider using aged rodents with a highest possible level of compliance to ensure minimal need for test subjects. Here, the pertinent steps and considerations necessary to achieve nearly 100% compliance to treadmill exercise in an aged rodent model will be presented and discussed. Notwithstanding the particular exercise regimen being employed by the investigator, our protocol should be of use to investigators that are particularly interested in the potential impact of forced exercise on aging-related impairments, including aging-related Parkinsonism and Parkinson’s disease.
Behavior, Issue 90, Exercise, locomotor, Parkinson’s disease, aging, treadmill, bradykinesia, Parkinsonism
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g.
protein-protein) and functional (e.g.
gene-gene or genetic) interactions (GI)1
. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2
. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7
, but GI information remains sparse for prokaryotes8
, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10
Here, we present the key steps required to perform quantitative E. coli
Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9
, using natural bacterial conjugation and homologous recombination to systemically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm) - marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan) - marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g.
the 'Keio' collection11
) and essential gene hypomorphic mutations (i.e.
alleles conferring reduced protein expression, stability, or activity9, 12, 13
) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14
. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9
. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2
. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e.
slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2
as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Infinium Assay for Large-scale SNP Genotyping Applications
Institutions: Oklahoma Medical Research Foundation.
Genotyping variants in the human genome has proven to be an efficient method to identify genetic associations with phenotypes. The distribution of variants within families or populations can facilitate identification of the genetic factors of disease. Illumina's panel of genotyping BeadChips allows investigators to genotype thousands or millions of single nucleotide polymorphisms (SNPs) or to analyze other genomic variants, such as copy number, across a large number of DNA samples. These SNPs can be spread throughout the genome or targeted in specific regions in order to maximize potential discovery. The Infinium assay has been optimized to yield high-quality, accurate results quickly. With proper setup, a single technician can process from a few hundred to over a thousand DNA samples per week, depending on the type of array. This assay guides users through every step, starting with genomic DNA and ending with the scanning of the array. Using propriety reagents, samples are amplified, fragmented, precipitated, resuspended, hybridized to the chip, extended by a single base, stained, and scanned on either an iScan or Hi Scan high-resolution optical imaging system. One overnight step is required to amplify the DNA. The DNA is denatured and isothermally amplified by whole-genome amplification; therefore, no PCR is required. Samples are hybridized to the arrays during a second overnight step. By the third day, the samples are ready to be scanned and analyzed. Amplified DNA may be stockpiled in large quantities, allowing bead arrays to be processed every day of the week, thereby maximizing throughput.
Basic Protocol, Issue 81, genomics, SNP, Genotyping, Infinium, iScan, HiScan, Illumina
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro
. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro
and in vivo
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus
, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus.
Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus
due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus Asian tiger mosquito, photoperiodic diapause, RNA-Seq de novo transcriptome assembly, mosquito husbandry
Mouse Genome Engineering Using Designer Nucleases
Institutions: University of Zurich, University of Minnesota.
Transgenic mice carrying site-specific genome modifications (knockout, knock-in) are of vital importance for dissecting complex biological systems as well as for modeling human diseases and testing therapeutic strategies. Recent advances in the use of designer nucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system for site-specific genome engineering open the possibility to perform rapid targeted genome modification in virtually any laboratory species without the need to rely on embryonic stem (ES) cell technology. A genome editing experiment typically starts with identification of designer nuclease target sites within a gene of interest followed by construction of custom DNA-binding domains to direct nuclease activity to the investigator-defined genomic locus. Designer nuclease plasmids are in vitro
transcribed to generate mRNA for microinjection of fertilized mouse oocytes. Here, we provide a protocol for achieving targeted genome modification by direct injection of TALEN mRNA into fertilized mouse oocytes.
Genetics, Issue 86, Oocyte microinjection, Designer nucleases, ZFN, TALEN, Genome Engineering
Detecting Somatic Genetic Alterations in Tumor Specimens by Exon Capture and Massively Parallel Sequencing
Institutions: Memorial Sloan-Kettering Cancer Center, Memorial Sloan-Kettering Cancer Center.
Efforts to detect and investigate key oncogenic mutations have proven valuable to facilitate the appropriate treatment for cancer patients. The establishment of high-throughput, massively parallel "next-generation" sequencing has aided the discovery of many such mutations. To enhance the clinical and translational utility of this technology, platforms must be high-throughput, cost-effective, and compatible with formalin-fixed paraffin embedded (FFPE) tissue samples that may yield small amounts of degraded or damaged DNA. Here, we describe the preparation of barcoded and multiplexed DNA libraries followed by hybridization-based capture of targeted exons for the detection of cancer-associated mutations in fresh frozen and FFPE tumors by massively parallel sequencing. This method enables the identification of sequence mutations, copy number alterations, and select structural rearrangements involving all targeted genes. Targeted exon sequencing offers the benefits of high throughput, low cost, and deep sequence coverage, thus conferring high sensitivity for detecting low frequency mutations.
Molecular Biology, Issue 80, Molecular Diagnostic Techniques, High-Throughput Nucleotide Sequencing, Genetics, Neoplasms, Diagnosis, Massively parallel sequencing, targeted exon sequencing, hybridization capture, cancer, FFPE, DNA mutations
Isolation of Fidelity Variants of RNA Viruses and Characterization of Virus Mutation Frequency
Institutions: Institut Pasteur .
RNA viruses use RNA dependent RNA polymerases to replicate their genomes. The intrinsically high error rate of these enzymes is a large contributor to the generation of extreme population diversity that facilitates virus adaptation and evolution. Increasing evidence shows that the intrinsic error rates, and the resulting mutation frequencies, of RNA viruses can be modulated by subtle amino acid changes to the viral polymerase. Although biochemical assays exist for some viral RNA polymerases that permit quantitative measure of incorporation fidelity, here we describe a simple method of measuring mutation frequencies of RNA viruses that has proven to be as accurate as biochemical approaches in identifying fidelity altering mutations. The approach uses conventional virological and sequencing techniques that can be performed in most biology laboratories. Based on our experience with a number of different viruses, we have identified the key steps that must be optimized to increase the likelihood of isolating fidelity variants and generating data of statistical significance. The isolation and characterization of fidelity altering mutations can provide new insights into polymerase structure and function1-3
. Furthermore, these fidelity variants can be useful tools in characterizing mechanisms of virus adaptation and evolution4-7
Immunology, Issue 52, Polymerase fidelity, RNA virus, mutation frequency, mutagen, RNA polymerase, viral evolution
2D and 3D Chromosome Painting in Malaria Mosquitoes
Institutions: Virginia Tech.
Fluorescent in situ
hybridization (FISH) of whole arm chromosome probes is a robust technique for mapping genomic regions of interest, detecting chromosomal rearrangements, and studying three-dimensional (3D) organization of chromosomes in the cell nucleus. The advent of laser capture microdissection (LCM) and whole genome amplification (WGA) allows obtaining large quantities of DNA from single cells. The increased sensitivity of WGA kits prompted us to develop chromosome paints and to use them for exploring chromosome organization and evolution in non-model organisms. Here, we present a simple method for isolating and amplifying the euchromatic segments of single polytene chromosome arms from ovarian nurse cells of the African malaria mosquito Anopheles gambiae
. This procedure provides an efficient platform for obtaining chromosome paints, while reducing the overall risk of introducing foreign DNA to the sample. The use of WGA allows for several rounds of re-amplification, resulting in high quantities of DNA that can be utilized for multiple experiments, including 2D and 3D FISH. We demonstrated that the developed chromosome paints can be successfully used to establish the correspondence between euchromatic portions of polytene and mitotic chromosome arms in An. gambiae
. Overall, the union of LCM and single-chromosome WGA provides an efficient tool for creating significant amounts of target DNA for future cytogenetic and genomic studies.
Immunology, Issue 83, Microdissection, whole genome amplification, malaria mosquito, polytene chromosome, mitotic chromosomes, fluorescence in situ hybridization, chromosome painting
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal, Universite de Montreal, Universite de Montreal.
There are several lines of evidence supporting the role of de novo
mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo
mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo
mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness1
and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo
mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them2
. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, the effects of paternal age: there is a significantly increased risk of the disease with increasing paternal age, which could result from the age related increase in paternal de novo
mutations. This is the case for autism and schizophrenia3
. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo
mutations would more frequently come from males, particularly older males4
. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complexes diseases genes, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification for de novo
mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo
mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Pyrosequencing: A Simple Method for Accurate Genotyping
Institutions: Washington University in St. Louis.
Pharmacogenetic research benefits first-hand from the abundance of information provided by the completion of the Human Genome Project. With such a tremendous amount of data available comes an explosion of genotyping methods. Pyrosequencing(R) is one of the most thorough yet simple methods to date used to analyze polymorphisms. It also has the ability to identify tri-allelic, indels, short-repeat polymorphisms, along with determining allele percentages for methylation or pooled sample assessment. In addition, there is a standardized control sequence that provides internal quality control. This method has led to rapid and efficient single-nucleotide polymorphism evaluation including many clinically relevant polymorphisms. The technique and methodology of Pyrosequencing is explained.
Cellular Biology, Issue 11, Springer Protocols, Pyrosequencing, genotype, polymorphism, SNP, pharmacogenetics, pharmacogenomics, PCR
Molecular Evolution of the Tre Recombinase
Institutions: Max Plank Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase, that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. Acting as intermediates, recombination activity was shown with the subsets. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles individual recombinases were functionally and structurally analyzed. The most active recombinase -- Tre -- had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution