As DNA sequencing technology has markedly advanced in recent years2, it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought3. In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease4,5. Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis suggesting that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants6-8. However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, and is ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis required remains prohibitive for many investigators.
To address this need, we have developed a pooled sequencing approach1,9 and a novel software package1 for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings to traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals. Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (http://www.ibridgenetwork.org/wustl/splinter). We show a comparison between pooled sequencing of 947 individuals, all of whom also underwent genome-wide array, at over 20kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample were excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA.
23 Related JoVE Articles!
Isolation of Fidelity Variants of RNA Viruses and Characterization of Virus Mutation Frequency
Institutions: Institut Pasteur .
RNA viruses use RNA dependent RNA polymerases to replicate their genomes. The intrinsically high error rate of these enzymes is a large contributor to the generation of extreme population diversity that facilitates virus adaptation and evolution. Increasing evidence shows that the intrinsic error rates, and the resulting mutation frequencies, of RNA viruses can be modulated by subtle amino acid changes to the viral polymerase. Although biochemical assays exist for some viral RNA polymerases that permit quantitative measure of incorporation fidelity, here we describe a simple method of measuring mutation frequencies of RNA viruses that has proven to be as accurate as biochemical approaches in identifying fidelity altering mutations. The approach uses conventional virological and sequencing techniques that can be performed in most biology laboratories. Based on our experience with a number of different viruses, we have identified the key steps that must be optimized to increase the likelihood of isolating fidelity variants and generating data of statistical significance. The isolation and characterization of fidelity altering mutations can provide new insights into polymerase structure and function1-3
. Furthermore, these fidelity variants can be useful tools in characterizing mechanisms of virus adaptation and evolution4-7
Immunology, Issue 52, Polymerase fidelity, RNA virus, mutation frequency, mutagen, RNA polymerase, viral evolution
A Comparative Approach to Characterize the Landscape of Host-Pathogen Protein-Protein Interactions
Institutions: Institut Pasteur , Université Sorbonne Paris Cité, Dana Farber Cancer Institute.
Significant efforts were gathered to generate large-scale comprehensive protein-protein interaction network maps. This is instrumental to understand the pathogen-host relationships and was essentially performed by genetic screenings in yeast two-hybrid systems. The recent improvement of protein-protein interaction detection by a Gaussia
luciferase-based fragment complementation assay now offers the opportunity to develop integrative comparative interactomic approaches necessary to rigorously compare interaction profiles of proteins from different pathogen strain variants against a common set of cellular factors.
This paper specifically focuses on the utility of combining two orthogonal methods to generate protein-protein interaction datasets: yeast two-hybrid (Y2H) and a new assay, high-throughput Gaussia princeps
protein complementation assay (HT-GPCA) performed in mammalian cells.
A large-scale identification of cellular partners of a pathogen protein is performed by mating-based yeast two-hybrid screenings of cDNA libraries using multiple pathogen strain variants. A subset of interacting partners selected on a high-confidence statistical scoring is further validated in mammalian cells for pair-wise interactions with the whole set of pathogen variants proteins using HT-GPCA. This combination of two complementary methods improves the robustness of the interaction dataset, and allows the performance of a stringent comparative interaction analysis. Such comparative interactomics constitute a reliable and powerful strategy to decipher any pathogen-host interplays.
Immunology, Issue 77, Genetics, Microbiology, Biochemistry, Molecular Biology, Cellular Biology, Biomedical Engineering, Infection, Cancer Biology, Virology, Medicine, Host-Pathogen Interactions, Host-Pathogen Interactions, Protein-protein interaction, High-throughput screening, Luminescence, Yeast two-hybrid, HT-GPCA, Network, protein, yeast, cell, culture
Experimental Manipulation of Body Size to Estimate Morphological Scaling Relationships in Drosophila
Institutions: University of Houston, Michigan State University.
The scaling of body parts is a central feature of animal morphology1-7
. Within species, morphological traits need to be correctly proportioned to the body for the organism to function; larger individuals typically have larger body parts and smaller individuals generally have smaller body parts, such that overall body shape is maintained across a range of adult body sizes. The requirement for correct proportions means that individuals within species usually exhibit low variation in relative trait size. In contrast, relative trait size can vary dramatically among species and is a primary mechanism by which morphological diversity is produced. Over a century of comparative work has established these intra- and interspecific patterns3,4
Perhaps the most widely used approach to describe this variation is to calculate the scaling relationship between the size of two morphological traits using the allometric equation y=bxα, where x and y are the size of the two traits, such as organ and body size8,9
. This equation describes the within-group (e.g., species, population) scaling relationship between two traits as both vary in size. Log-transformation of this equation produces a simple linear equation, log(y) = log(b) + αlog(x) and log-log plots of the size of different traits among individuals of the same species typically reveal linear scaling with an intercept of log(b) and a slope of α, called the 'allometric coefficient'9,10
. Morphological variation among groups is described by differences in scaling relationship intercepts or slopes for a given trait pair. Consequently, variation in the parameters of the allometric equation (b and α) elegantly describes the shape variation captured in the relationship between organ and body size within and among biological groups (see 11,12
Not all traits scale linearly with each other or with body size (e.g., 13,14
) Hence, morphological scaling relationships are most informative when the data are taken from the full range of trait sizes. Here we describe how simple experimental manipulation of diet can be used to produce the full range of body size in insects. This permits an estimation of the full scaling relationship for any given pair of traits, allowing a complete description of how shape covaries with size and a robust comparison of scaling relationship parameters among biological groups. Although we focus on Drosophila
, our methodology should be applicable to nearly any fully metamorphic insect.
Developmental Biology, Issue 56, Drosophila, allometry, morphology, body size, scaling, insect
ScanLag: High-throughput Quantification of Colony Growth and Lag Time
Institutions: The Hebrew University of Jerusalem.
Growth dynamics are fundamental characteristics of microorganisms. Quantifying growth precisely is an important goal in microbiology. Growth dynamics are affected both by the doubling time of the microorganism and by any delay in growth upon transfer from one condition to another, the lag. The ScanLag method enables the characterization of these two independent properties at the level of colonies originating each from a single cell, generating a two-dimensional distribution of the lag time and of the growth time. In ScanLag, measurement of the time it takes for colonies on conventional nutrient agar plates to be detected is automated on an array of commercial scanners controlled by an in house application. Petri dishes are placed on the scanners, and the application acquires images periodically. Automated analysis of colony growth is then done by an application that returns the appearance time and growth rate of each colony. Other parameters, such as the shape, texture and color of the colony, can be extracted for multidimensional mapping of sub-populations of cells. Finally, the method enables the retrieval of rare variants with specific growth phenotypes for further characterization. The technique could be applied in bacteriology for the identification of long lag that can cause persistence to antibiotics, as well as a general low cost technique for phenotypic screens.
Immunology, Issue 89, lag, growth rate, growth delay, single cell, scanners, image analysis, persistence, resistance, rare mutants, phenotypic screens, phenomics
Identifying the Effects of BRCA1 Mutations on Homologous Recombination using Cells that Express Endogenous Wild-type BRCA1
Institutions: The Ohio State University, Tohoku University.
The functional analysis of missense mutations can be complicated by the presence in the cell of the endogenous protein. Structure-function analyses of the BRCA1 have been complicated by the lack of a robust assay for the full length BRCA1 protein and the difficulties inherent in working with cell lines that express hypomorphic BRCA1 protein1,2,3,4,5
. We developed a system whereby the endogenous BRCA1 protein in a cell was acutely depleted by RNAi targeting the 3'-UTR of the BRCA1 mRNA and replaced by co-transfecting a plasmid expressing a BRCA1 variant. One advantage of this procedure is that the acute silencing of BRCA1 and simultaneous replacement allow the cells to grow without secondary mutations or adaptations that might arise over time to compensate for the loss of BRCA1 function. This depletion and add-back procedure was done in a HeLa-derived cell line that was readily assayed for homologous recombination activity. The homologous recombination assay is based on a previously published method whereby a recombination substrate is integrated into the genome (Figure 1)6,7,8,9
. This recombination substrate has the rare-cutting I-SceI restriction enzyme site inside an inactive GFP allele, and downstream is a second inactive GFP allele. Transfection of the plasmid that expresses I-SceI results in a double-stranded break, which may be repaired by homologous recombination, and if homologous recombination does repair the break it creates an active GFP allele that is readily scored by flow cytometry for GFP protein expression. Depletion of endogenous BRCA1 resulted in an 8-10-fold reduction in homologous recombination activity, and add-back of wild-type plasmid fully restored homologous recombination function. When specific point mutants of full length BRCA1 were expressed from co-transfected plasmids, the effect of the specific missense mutant could be scored. As an example, the expression of the BRCA1(M18T) protein, a variant of unknown clinical significance10
, was expressed in these cells, it failed to restore BRCA1-dependent homologous recombination. By contrast, expression of another variant, also of unknown significance, BRCA1(I21V) fully restored BRCA1-dependent homologous recombination function. This strategy of testing the function of BRCA1 missense mutations has been applied to another biological system assaying for centrosome function (Kais et al, unpublished observations). Overall, this approach is suitable for the analysis of missense mutants in any gene that must be analyzed recessively.
Cell Biology, Issue 48, BRCA1, homologous recombination, breast cancer, RNA interference, DNA repair
Infinium Assay for Large-scale SNP Genotyping Applications
Institutions: Oklahoma Medical Research Foundation.
Genotyping variants in the human genome has proven to be an efficient method to identify genetic associations with phenotypes. The distribution of variants within families or populations can facilitate identification of the genetic factors of disease. Illumina's panel of genotyping BeadChips allows investigators to genotype thousands or millions of single nucleotide polymorphisms (SNPs) or to analyze other genomic variants, such as copy number, across a large number of DNA samples. These SNPs can be spread throughout the genome or targeted in specific regions in order to maximize potential discovery. The Infinium assay has been optimized to yield high-quality, accurate results quickly. With proper setup, a single technician can process from a few hundred to over a thousand DNA samples per week, depending on the type of array. This assay guides users through every step, starting with genomic DNA and ending with the scanning of the array. Using propriety reagents, samples are amplified, fragmented, precipitated, resuspended, hybridized to the chip, extended by a single base, stained, and scanned on either an iScan or Hi Scan high-resolution optical imaging system. One overnight step is required to amplify the DNA. The DNA is denatured and isothermally amplified by whole-genome amplification; therefore, no PCR is required. Samples are hybridized to the arrays during a second overnight step. By the third day, the samples are ready to be scanned and analyzed. Amplified DNA may be stockpiled in large quantities, allowing bead arrays to be processed every day of the week, thereby maximizing throughput.
Basic Protocol, Issue 81, genomics, SNP, Genotyping, Infinium, iScan, HiScan, Illumina
A Restriction Enzyme Based Cloning Method to Assess the In vitro Replication Capacity of HIV-1 Subtype C Gag-MJ4 Chimeric Viruses
Institutions: Emory University, Emory University.
The protective effect of many HLA class I alleles on HIV-1 pathogenesis and disease progression is, in part, attributed to their ability to target conserved portions of the HIV-1 genome that escape with difficulty. Sequence changes attributed to cellular immune pressure arise across the genome during infection, and if found within conserved regions of the genome such as Gag, can affect the ability of the virus to replicate in vitro
. Transmission of HLA-linked polymorphisms in Gag to HLA-mismatched recipients has been associated with reduced set point viral loads. We hypothesized this may be due to a reduced replication capacity of the virus. Here we present a novel method for assessing the in vitro
replication of HIV-1 as influenced by the gag
gene isolated from acute time points from subtype C infected Zambians. This method uses restriction enzyme based cloning to insert the gag
gene into a common subtype C HIV-1 proviral backbone, MJ4. This makes it more appropriate to the study of subtype C sequences than previous recombination based methods that have assessed the in vitro
replication of chronically derived gag-pro
sequences. Nevertheless, the protocol could be readily modified for studies of viruses from other subtypes. Moreover, this protocol details a robust and reproducible method for assessing the replication capacity of the Gag-MJ4 chimeric viruses on a CEM-based T cell line. This method was utilized for the study of Gag-MJ4 chimeric viruses derived from 149 subtype C acutely infected Zambians, and has allowed for the identification of residues in Gag that affect replication. More importantly, the implementation of this technique has facilitated a deeper understanding of how viral replication defines parameters of early HIV-1 pathogenesis such as set point viral load and longitudinal CD4+ T cell decline.
Infectious Diseases, Issue 90, HIV-1, Gag, viral replication, replication capacity, viral fitness, MJ4, CEM, GXR25
The ChroP Approach Combines ChIP and Mass Spectrometry to Dissect Locus-specific Proteomic Landscapes of Chromatin
Institutions: European Institute of Oncology.
Chromatin is a highly dynamic nucleoprotein complex made of DNA and proteins that controls various DNA-dependent processes. Chromatin structure and function at specific regions is regulated by the local enrichment of histone post-translational modifications (hPTMs) and variants, chromatin-binding proteins, including transcription factors, and DNA methylation. The proteomic characterization of chromatin composition at distinct functional regions has been so far hampered by the lack of efficient protocols to enrich such domains at the appropriate purity and amount for the subsequent in-depth analysis by Mass Spectrometry (MS). We describe here a newly designed chromatin proteomics strategy, named ChroP (Chromatin Proteomics
), whereby a preparative chromatin immunoprecipitation is used to isolate distinct chromatin regions whose features, in terms of hPTMs, variants and co-associated non-histonic proteins, are analyzed by MS. We illustrate here the setting up of ChroP for the enrichment and analysis of transcriptionally silent heterochromatic regions, marked by the presence of tri-methylation of lysine 9 on histone H3. The results achieved demonstrate the potential of ChroP
in thoroughly characterizing the heterochromatin proteome and prove it as a powerful analytical strategy for understanding how the distinct protein determinants of chromatin interact and synergize to establish locus-specific structural and functional configurations.
Biochemistry, Issue 86, chromatin, histone post-translational modifications (hPTMs), epigenetics, mass spectrometry, proteomics, SILAC, chromatin immunoprecipitation , histone variants, chromatome, hPTMs cross-talks
Human Neutrophil Flow Chamber Adhesion Assay
Institutions: University of Alabama at Birmingham, Birmingham Veterans Affairs Medical Center, University of Alabama at Birmingham, University of Alabama at Birmingham, University of Alabama at Birmingham.
Neutrophil firm adhesion to endothelial cells plays a critical role in inflammation in both health and disease. The process of neutrophil firm adhesion involves many different adhesion molecules including members of the β2
integrin family and their counter-receptors of the ICAM family. Recently, naturally occurring genetic variants in both β2
integrins and ICAMs are reported to be associated with autoimmune disease. Thus, the quantitative adhesive capacity of neutrophils from individuals with varying allelic forms of these adhesion molecules is important to study in relation to mechanisms underlying development of autoimmunity. Adhesion studies in flow chamber systems can create an environment with fluid shear stress similar to that observed in the blood vessel environment in vivo
. Here, we present a method using a flow chamber assay system to study the quantitative adhesive properties of human peripheral blood neutrophils to human umbilical vein endothelial cell (HUVEC) and to purified ligand substrates. With this method, the neutrophil adhesive capacities from donors with different allelic variants in adhesion receptors can be assessed and compared. This method can also be modified to assess adhesion of other primary cell types or cell lines.
Immunology, Issue 89, neutrophil adhesion, flow chamber, human umbilical vein endothelial cell (HUVEC), purified ligand
In Vivo Modeling of the Morbid Human Genome using Danio rerio
Institutions: Duke University Medical Center, Duke University, Duke University Medical Center.
Here, we present methods for the development of assays to query potentially clinically significant nonsynonymous changes using in vivo
complementation in zebrafish. Zebrafish (Danio rerio
) are a useful animal system due to their experimental tractability; embryos are transparent to enable facile viewing, undergo rapid development ex vivo,
and can be genetically manipulated.1
These aspects have allowed for significant advances in the analysis of embryogenesis, molecular processes, and morphogenetic signaling. Taken together, the advantages of this vertebrate model make zebrafish highly amenable to modeling the developmental defects in pediatric disease, and in some cases, adult-onset disorders. Because the zebrafish genome is highly conserved with that of humans (~70% orthologous), it is possible to recapitulate human disease states in zebrafish. This is accomplished either through the injection of mutant human mRNA to induce dominant negative or gain of function alleles, or utilization of morpholino (MO) antisense oligonucleotides to suppress genes to mimic loss of function variants. Through complementation of MO-induced phenotypes with capped human mRNA, our approach enables the interpretation of the deleterious effect of mutations on human protein sequence based on the ability of mutant mRNA to rescue a measurable, physiologically relevant phenotype. Modeling of the human disease alleles occurs through microinjection of zebrafish embryos with MO and/or human mRNA at the 1-4 cell stage, and phenotyping up to seven days post fertilization (dpf). This general strategy can be extended to a wide range of disease phenotypes, as demonstrated in the following protocol. We present our established models for morphogenetic signaling, craniofacial, cardiac, vascular integrity, renal function, and skeletal muscle disorder phenotypes, as well as others.
Molecular Biology, Issue 78, Genetics, Biomedical Engineering, Medicine, Developmental Biology, Biochemistry, Anatomy, Physiology, Bioengineering, Genomics, Medical, zebrafish, in vivo, morpholino, human disease modeling, transcription, PCR, mRNA, DNA, Danio rerio, animal model
An Allele-specific Gene Expression Assay to Test the Functional Basis of Genetic Associations
Institutions: University of Oxford.
The number of significant genetic associations with common complex traits is constantly increasing. However, most of these associations have not been understood at molecular level. One of the mechanisms mediating the effect of DNA variants on phenotypes is gene expression, which has been shown to be particularly relevant for complex traits1
This method tests in a cellular context the effect of specific DNA sequences on gene expression. The principle is to measure the relative abundance of transcripts arising from the two alleles of a gene, analysing cells which carry one copy of the DNA sequences associated with disease (the risk variants)2,3
. Therefore, the cells used for this method should meet two fundamental genotypic requirements: they have to be heterozygous both for DNA risk variants and for DNA markers, typically coding polymorphisms, which can distinguish transcripts based on their chromosomal origin (Figure 1). DNA risk variants and DNA markers do not need to have the same allele frequency but the phase (haplotypic) relationship of the genetic markers needs to be understood. It is also important to choose cell types which express the gene of interest. This protocol refers specifically to the procedure adopted to extract nucleic acids from fibroblasts but the method is equally applicable to other cells types including primary cells.
DNA and RNA are extracted from the selected cell lines and cDNA is generated. DNA and cDNA are analysed with a primer extension assay, designed to target the coding DNA markers4
. The primer extension assay is carried out using the MassARRAY (Sequenom)5
platform according to the manufacturer's specifications. Primer extension products are then analysed by matrix-assisted laser desorption/ionization time of-flight mass spectrometry (MALDI-TOF/MS). Because the selected markers are heterozygous they will generate two peaks on the MS profiles. The area of each peak is proportional to the transcript abundance and can be measured with a function of the MassARRAY Typer software to generate an allelic ratio (allele 1: allele 2) calculation. The allelic ratio obtained for cDNA is normalized using that measured from genomic DNA, where the allelic ratio is expected to be 1:1 to correct for technical artifacts. Markers with a normalised allelic ratio significantly different to 1 indicate that the amount of transcript generated from the two chromosomes in the same cell is different, suggesting that the DNA variants associated with the phenotype have an effect on gene expression. Experimental controls should be used to confirm the results.
Cellular Biology, Issue 45, Gene expression, regulatory variant, haplotype, association study, primer extension, MALDI-TOF mass spectrometry, single nucleotide polymorphism, allele-specific
The Importance of Correct Protein Concentration for Kinetics and Affinity Determination in Structure-function Analysis
Institutions: GE Healthcare Bio-Sciences AB.
In this study, we explore the interaction between the bovine cysteine protease inhibitor cystatin B and a catalytically inactive form of papain (Fig. 1), a plant cysteine protease, by real-time label-free analysis using Biacore X100. Several cystatin B variants with point mutations in areas of interaction with papain, are produced. For each cystatin B variant we determine its specific binding concentration using calibration-free concentration analysis (CFCA) and compare the values obtained with total protein concentration as determined by A280
. After that, the kinetics of each cystatin B variant binding to papain is measured using single-cycle kinetics (SCK). We show that one of the four cystatin B variants we examine is only partially active for binding. This partial activity, revealed by CFCA, translates to a significant difference in the association rate constant (ka
) and affinity (KD
), compared to the values calculated using total protein concentration. Using CFCA in combination with kinetic analysis in a structure-function study contributes to obtaining reliable results, and helps to make the right interpretation of the interaction mechanism.
Cellular Biology, Issue 37, Protein interaction, Surface Plasmon Resonance, Biacore X100, CFCA, Cystatin B, Papain
gDNA Enrichment by a Transposase-based Technology for NGS Analysis of the Whole Sequence of BRCA1, BRCA2, and 9 Genes Involved in DNA Damage Repair
Institutions: Centre Georges-François Leclerc.
The widespread use of Next Generation Sequencing has opened up new avenues for cancer research and diagnosis. NGS will bring huge amounts of new data on cancer, and especially cancer genetics. Current knowledge and future discoveries will make it necessary to study a huge number of genes that could be involved in a genetic predisposition to cancer. In this regard, we developed a Nextera design to study 11 complete genes involved in DNA damage repair. This protocol was developed to safely study 11 genes (ATM
, and TP53
) from promoter to 3'-UTR in 24 patients simultaneously. This protocol, based on transposase technology and gDNA enrichment, gives a great advantage in terms of time for the genetic diagnosis thanks to sample multiplexing. This protocol can be safely used with blood gDNA.
Genetics, Issue 92, gDNA enrichment, Nextera, NGS, DNA damage, BRCA1, BRCA2
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Generation of High Quality Chromatin Immunoprecipitation DNA Template for High-throughput Sequencing (ChIP-seq)
Institutions: Children's Hospital of Philadelphia Research Institute, University of Pennsylvania .
ChIP-sequencing (ChIP-seq) methods directly offer whole-genome coverage, where combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing can be utilized to identify the repertoire of mammalian DNA sequences bound by transcription factors in vivo
. "Next-generation" genome sequencing technologies provide 1-2 orders of magnitude increase in the amount of sequence that can be cost-effectively generated over older technologies thus allowing for ChIP-seq methods to directly provide whole-genome coverage for effective profiling of mammalian protein-DNA interactions.
For successful ChIP-seq approaches, one must generate high quality ChIP DNA template to obtain the best sequencing outcomes. The description is based around experience with the protein product of the gene most strongly implicated in the pathogenesis of type 2 diabetes, namely the transcription factor transcription factor 7-like 2 (TCF7L2). This factor has also been implicated in various cancers.
Outlined is how to generate high quality ChIP DNA template derived from the colorectal carcinoma cell line, HCT116, in order to build a high-resolution map through sequencing to determine the genes bound by TCF7L2, giving further insight in to its key role in the pathogenesis of complex traits.
Molecular Biology, Issue 74, Genetics, Biochemistry, Microbiology, Medicine, Proteins, DNA-Binding Proteins, Transcription Factors, Chromatin Immunoprecipitation, Genes, chromatin, immunoprecipitation, ChIP, DNA, PCR, sequencing, antibody, cross-link, cell culture, assay
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Automated, Quantitative Cognitive/Behavioral Screening of Mice: For Genetics, Pharmacology, Animal Cognition and Undergraduate Instruction
Institutions: Rutgers University, Koç University, New York University, Fairfield University.
We describe a high-throughput, high-volume, fully automated, live-in 24/7 behavioral testing system for assessing the effects of genetic and pharmacological manipulations on basic mechanisms of cognition and learning in mice. A standard polypropylene mouse housing tub is connected through an acrylic tube to a standard commercial mouse test box. The test box has 3 hoppers, 2 of which are connected to pellet feeders. All are internally illuminable with an LED and monitored for head entries by infrared (IR) beams. Mice live in the environment, which eliminates handling during screening. They obtain their food during two or more daily feeding periods by performing in operant (instrumental) and Pavlovian (classical) protocols, for which we have written protocol-control software and quasi-real-time data analysis and graphing software. The data analysis and graphing routines are written in a MATLAB-based language created to simplify greatly the analysis of large time-stamped behavioral and physiological event records and to preserve a full data trail from raw data through all intermediate analyses to the published graphs and statistics within a single data structure. The data-analysis code harvests the data several times a day and subjects it to statistical and graphical analyses, which are automatically stored in the "cloud" and on in-lab computers. Thus, the progress of individual mice is visualized and quantified daily. The data-analysis code talks to the protocol-control code, permitting the automated advance from protocol to protocol of individual subjects. The behavioral protocols implemented are matching, autoshaping, timed hopper-switching, risk assessment in timed hopper-switching, impulsivity measurement, and the circadian anticipation of food availability. Open-source protocol-control and data-analysis code makes the addition of new protocols simple. Eight test environments fit in a 48 in x 24 in x 78 in cabinet; two such cabinets (16 environments) may be controlled by one computer.
Behavior, Issue 84, genetics, cognitive mechanisms, behavioral screening, learning, memory, timing
Simultaneous Multicolor Imaging of Biological Structures with Fluorescence Photoactivation Localization Microscopy
Institutions: University of Maine.
Localization-based super resolution microscopy can be applied to obtain a spatial map (image) of the distribution of individual fluorescently labeled single molecules within a sample with a spatial resolution of tens of nanometers. Using either photoactivatable (PAFP) or photoswitchable (PSFP) fluorescent proteins fused to proteins of interest, or organic dyes conjugated to antibodies or other molecules of interest, fluorescence photoactivation localization microscopy (FPALM) can simultaneously image multiple species of molecules within single cells. By using the following approach, populations of large numbers (thousands to hundreds of thousands) of individual molecules are imaged in single cells and localized with a precision of ~10-30 nm. Data obtained can be applied to understanding the nanoscale spatial distributions of multiple protein types within a cell. One primary advantage of this technique is the dramatic increase in spatial resolution: while diffraction limits resolution to ~200-250 nm in conventional light microscopy, FPALM can image length scales more than an order of magnitude smaller. As many biological hypotheses concern the spatial relationships among different biomolecules, the improved resolution of FPALM can provide insight into questions of cellular organization which have previously been inaccessible to conventional fluorescence microscopy. In addition to detailing the methods for sample preparation and data acquisition, we here describe the optical setup for FPALM. One additional consideration for researchers wishing to do super-resolution microscopy is cost: in-house setups are significantly cheaper than most commercially available imaging machines. Limitations of this technique include the need for optimizing the labeling of molecules of interest within cell samples, and the need for post-processing software to visualize results. We here describe the use of PAFP and PSFP expression to image two protein species in fixed cells. Extension of the technique to living cells is also described.
Basic Protocol, Issue 82, Microscopy, Super-resolution imaging, Multicolor, single molecule, FPALM, Localization microscopy, fluorescent proteins
Analysis of Tubular Membrane Networks in Cardiac Myocytes from Atria and Ventricles
Institutions: Heart Research Center Goettingen, University Medical Center Goettingen, German Center for Cardiovascular Research (DZHK) partner site Goettingen, University of Maryland School of Medicine.
In cardiac myocytes a complex network of membrane tubules - the transverse-axial tubule system (TATS) - controls deep intracellular signaling functions. While the outer surface membrane and associated TATS membrane components appear to be continuous, there are substantial differences in lipid and protein content. In ventricular myocytes (VMs), certain TATS components are highly abundant contributing to rectilinear tubule networks and regular branching 3D architectures. It is thought that peripheral TATS components propagate action potentials from the cell surface to thousands of remote intracellular sarcoendoplasmic reticulum (SER) membrane contact domains, thereby activating intracellular Ca2+
release units (CRUs). In contrast to VMs, the organization and functional role of TATS membranes in atrial myocytes (AMs) is significantly different and much less understood. Taken together, quantitative structural characterization of TATS membrane networks in healthy and diseased myocytes is an essential prerequisite towards better understanding of functional plasticity and pathophysiological reorganization. Here, we present a strategic combination of protocols for direct quantitative analysis of TATS membrane networks in living VMs and AMs. For this, we accompany primary cell isolations of mouse VMs and/or AMs with critical quality control steps and direct membrane staining protocols for fluorescence imaging of TATS membranes. Using an optimized workflow for confocal or superresolution TATS image processing, binarized and skeletonized data are generated for quantitative analysis of the TATS network and its components. Unlike previously published indirect regional aggregate image analysis strategies, our protocols enable direct characterization of specific components and derive complex physiological properties of TATS membrane networks in living myocytes with high throughput and open access software tools. In summary, the combined protocol strategy can be readily applied for quantitative TATS network studies during physiological myocyte adaptation or disease changes, comparison of different cardiac or skeletal muscle cell types, phenotyping of transgenic models, and pharmacological or therapeutic interventions.
Bioengineering, Issue 92, cardiac myocyte, atria, ventricle, heart, primary cell isolation, fluorescence microscopy, membrane tubule, transverse-axial tubule system, image analysis, image processing, T-tubule, collagenase
Test Samples for Optimizing STORM Super-Resolution Microscopy
Institutions: National Physical Laboratory.
STORM is a recently developed super-resolution microscopy technique with up to 10 times better resolution than standard fluorescence microscopy techniques. However, as the image is acquired in a very different way than normal, by building up an image molecule-by-molecule, there are some significant challenges for users in trying to optimize their image acquisition. In order to aid this process and gain more insight into how STORM works we present the preparation of 3 test samples and the methodology of acquiring and processing STORM super-resolution images with typical resolutions of between 30-50 nm. By combining the test samples with the use of the freely available rainSTORM processing software it is possible to obtain a great deal of information about image quality and resolution. Using these metrics it is then possible to optimize the imaging procedure from the optics, to sample preparation, dye choice, buffer conditions, and image acquisition settings. We also show examples of some common problems that result in poor image quality, such as lateral drift, where the sample moves during image acquisition and density related problems resulting in the 'mislocalization' phenomenon.
Molecular Biology, Issue 79, Genetics, Bioengineering, Biomedical Engineering, Biophysics, Basic Protocols, HeLa Cells, Actin Cytoskeleton, Coated Vesicles, Receptor, Epidermal Growth Factor, Actins, Fluorescence, Endocytosis, Microscopy, STORM, super-resolution microscopy, nanoscopy, cell biology, fluorescence microscopy, test samples, resolution, actin filaments, fiducial markers, epidermal growth factor, cell, imaging
Determination of Protein-ligand Interactions Using Differential Scanning Fluorimetry
Institutions: University of Exeter.
A wide range of methods are currently available for determining the dissociation constant between a protein and interacting small molecules. However, most of these require access to specialist equipment, and often require a degree of expertise to effectively establish reliable experiments and analyze data. Differential scanning fluorimetry (DSF) is being increasingly used as a robust method for initial screening of proteins for interacting small molecules, either for identifying physiological partners or for hit discovery. This technique has the advantage that it requires only a PCR machine suitable for quantitative PCR, and so suitable instrumentation is available in most institutions; an excellent range of protocols are already available; and there are strong precedents in the literature for multiple uses of the method. Past work has proposed several means of calculating dissociation constants from DSF data, but these are mathematically demanding. Here, we demonstrate a method for estimating dissociation constants from a moderate amount of DSF experimental data. These data can typically be collected and analyzed within a single day. We demonstrate how different models can be used to fit data collected from simple binding events, and where cooperative binding or independent binding sites are present. Finally, we present an example of data analysis in a case where standard models do not apply. These methods are illustrated with data collected on commercially available control proteins, and two proteins from our research program. Overall, our method provides a straightforward way for researchers to rapidly gain further insight into protein-ligand interactions using DSF.
Biophysics, Issue 91, differential scanning fluorimetry, dissociation constant, protein-ligand interactions, StepOne, cooperativity, WcbI.
From Fast Fluorescence Imaging to Molecular Diffusion Law on Live Cell Membranes in a Commercial Microscope
Institutions: Scuola Normale Superiore, Instituto Italiano di Tecnologia, University of California, Irvine.
It has become increasingly evident that the spatial distribution and the motion of membrane components like lipids and proteins are key factors in the regulation of many cellular functions. However, due to the fast dynamics and the tiny structures involved, a very high spatio-temporal resolution is required to catch the real behavior of molecules. Here we present the experimental protocol for studying the dynamics of fluorescently-labeled plasma-membrane proteins and lipids in live cells with high spatiotemporal resolution. Notably, this approach doesn’t need to track each molecule, but it calculates population behavior using all molecules in a given region of the membrane. The starting point is a fast imaging of a given region on the membrane. Afterwards, a complete spatio-temporal autocorrelation function is calculated correlating acquired images at increasing time delays, for example each 2, 3, n repetitions. It is possible to demonstrate that the width of the peak of the spatial autocorrelation function increases at increasing time delay as a function of particle movement due to diffusion. Therefore, fitting of the series of autocorrelation functions enables to extract the actual protein mean square displacement from imaging (iMSD), here presented in the form of apparent diffusivity vs average displacement. This yields a quantitative view of the average dynamics of single molecules with nanometer accuracy. By using a GFP-tagged variant of the Transferrin Receptor (TfR) and an ATTO488 labeled 1-palmitoyl-2-hydroxy-sn
-glycero-3-phosphoethanolamine (PPE) it is possible to observe the spatiotemporal regulation of protein and lipid diffusion on µm-sized membrane regions in the micro-to-milli-second time range.
Bioengineering, Issue 92, fluorescence, protein dynamics, lipid dynamics, membrane heterogeneity, transient confinement, single molecule, GFP
A Strategy to Identify de Novo Mutations in Common Disorders such as Autism and Schizophrenia
Institutions: Universite de Montreal, Universite de Montreal, Universite de Montreal.
There are several lines of evidence supporting the role of de novo
mutations as a mechanism for common disorders, such as autism and schizophrenia. First, the de novo
mutation rate in humans is relatively high, so new mutations are generated at a high frequency in the population. However, de novo
mutations have not been reported in most common diseases. Mutations in genes leading to severe diseases where there is a strong negative selection against the phenotype, such as lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members, and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for diseases such as autism and schizophrenia. Second, despite reduced reproductive fitness1
and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo
mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to be of different intensity in different environments. Nonetheless, these severe mental disorders have been maintained at a constant relatively high prevalence in the worldwide population across a wide range of cultures and countries despite a strong negative selection against them2
. This is not what one would predict in diseases with reduced reproductive fitness, unless there was a high new mutation rate. Finally, the effects of paternal age: there is a significantly increased risk of the disease with increasing paternal age, which could result from the age related increase in paternal de novo
mutations. This is the case for autism and schizophrenia3
. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to a higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo
mutations would more frequently come from males, particularly older males4
. A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complexes diseases genes, such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identification for de novo
mutations as a cause of a disease requires a targeted molecular approach, which includes studying parents and affected subjects. The process for determining if the genetic basis of a disease may result in part from de novo
mutations and the molecular approach to establish this link will be illustrated, using autism and schizophrenia as examples.
Medicine, Issue 52, de novo mutation, complex diseases, schizophrenia, autism, rare variations, DNA sequencing