As DNA sequencing technology has markedly advanced in recent years2, it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought3. In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease4,5. Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis, which suggests that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants6-8. However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis remain prohibitive for many investigators.
To address this need, we have developed a pooled sequencing approach1,9 and a novel software package1 for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings over traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals. Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (https://www.ibridgenetwork.org/wustl/splinter). We show a comparison between pooled sequencing of 947 individuals, all of whom were also genotyped by genome-wide array, at over 20 kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample was excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA.
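The coverage arithmetic behind a pooled design can be sketched as follows. This is a minimal illustration, not part of SPLINTER: the function names are hypothetical, and it assumes diploid autosomal loci and equal representation of every individual in the pool.

```python
def required_pool_coverage(individuals: int,
                           coverage_per_allele: float = 25.0,
                           ploidy: int = 2) -> float:
    """Total read depth needed at a locus so that each allele in the
    pool is covered `coverage_per_allele` times on average.

    Assumes every individual contributes equally to the pool."""
    alleles = individuals * ploidy
    return alleles * coverage_per_allele


def single_mutant_allele_fraction(individuals: int, ploidy: int = 2) -> float:
    """Expected read fraction of one heterozygous mutant allele in the pool."""
    return 1.0 / (individuals * ploidy)


if __name__ == "__main__":
    n = 500  # pool size mentioned in the abstract
    print(required_pool_coverage(n))         # total depth for 25-fold per allele
    print(single_mutant_allele_fraction(n))  # expected frequency of a singleton
```

Under these assumptions a single mutant allele in a pool of 500 diploid individuals appears in about 1 read per 1,000, which is why the per-allele coverage, rather than the total depth alone, is the quantity to plan around.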
27 Related JoVE Articles!
Fabricating Nanogaps by Nanoskiving
Institutions: University of Groningen.
There are several methods of fabricating nanogaps with controlled spacings, but precise control over the sub-nanometer spacing between two electrodes, and generating them in practical quantities, is still challenging. The preparation of nanogap electrodes using nanoskiving, which is a form of edge lithography, is a fast, simple and powerful technique. This method is an entirely mechanical process which does not include any photo- or electron-beam lithographic steps and does not require any special equipment or infrastructure such as clean rooms. Nanoskiving is used to fabricate electrically addressable nanogaps with control over all three dimensions; the smallest dimension of these structures is defined by the thickness of the sacrificial layer (Al or Ag) or self-assembled monolayers. These wires can be manually positioned by transporting them on drops of water and are directly electrically addressable; no further lithography is required to connect them to an electrometer.
Chemistry, Issue 75, Materials Science, Chemical Engineering, Electrical Engineering, Physics, Nanotechnology, nanodevices (electronic), Nanoskiving, nanogaps, nanofabrication, molecular electronics, nanowires, fabrication, etching, ultramicrotome, scanning electron microscopy, SEM
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
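The idea of scanning a sequence for regions enriched in a particular combination of bases can be sketched with a sliding window. This is a simplified illustration only, not the Genomic MRI package's algorithm; the function name, window size and threshold are assumptions chosen for the example.

```python
def enriched_windows(seq, bases="GT", window=30, threshold=0.7):
    """Return (start, fraction) for every window whose content of the
    given bases is at least `threshold`.

    Hypothetical helper: window and threshold are illustrative values,
    not parameters taken from the Genomic MRI program."""
    seq = seq.upper()
    target = set(bases.upper())
    if window <= 0 or len(seq) < window:
        return []
    # Count target bases in the first window, then update incrementally.
    count = sum(1 for c in seq[:window] if c in target)
    hits = []
    for start in range(len(seq) - window + 1):
        if start > 0:
            count -= seq[start - 1] in target           # base leaving the window
            count += seq[start + window - 1] in target  # base entering the window
        fraction = count / window
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


if __name__ == "__main__":
    demo = "A" * 10 + "GT" * 5 + "A" * 10
    print(enriched_windows(demo, bases="GT", window=10, threshold=1.0))
```

Overlapping hits would normally be merged into regions, and significance would be judged against randomized sequences, which is the role the abstract assigns to the package's sequence randomization tools.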
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
The Portable Chemical Sterilizer (PCS), D-FENS, and D-FEND ALL: Novel Chlorine Dioxide Decontamination Technologies for the Military
Institutions: United States Army-Natick Soldier RD&E Center, Warfighter Directorate, University of Connecticut Health Center, Lawrence Livermore National Laboratory, Children's Hospital Oakland Research Institute.
There is a stated Army need for a field-portable, non-steam sterilizer technology that can be used by Forward Surgical Teams, Dental Companies, Veterinary Service Support Detachments, Combat Support Hospitals, and Area Medical Laboratories to sterilize surgical instruments and to sterilize pathological specimens prior to disposal in operating rooms, emergency treatment areas, and intensive care units. The following ensemble of novel, ‘clean and green’ chlorine dioxide technologies is versatile and flexible enough to meet a number of critical military needs for decontamination6,15. Specifically, the Portable Chemical Sterilizer (PCS) was invented to meet urgent battlefield needs and close critical capability gaps for energy-independence, lightweight portability, rapid mobility, and rugged durability in high intensity forward deployments3. As a revolutionary technological breakthrough in surgical sterilization technology, the PCS is a Modern Field Autoclave that relies on on-site, point-of-use, at-will generation of chlorine dioxide instead of steam. Two (2) PCS units sterilize 4 surgical trays in 1 hr, which is the equivalent throughput of one large steam autoclave (nicknamed “Bertha” in deployments because of its cumbersome size, bulky dimensions, and weight). However, the PCS operates using 100% less electricity (0 vs. 9 kW) and 98% less water (10 vs. 640 oz.), significantly reduces weight by 95% (20 vs. 450 lbs, a 4-man lift) and cube by 96% (2.1 vs. 60.2 ft³), and virtually eliminates the difficult challenges in forward deployments of repairs and maintaining reliable operation, lifting and transporting, and electrical power required for steam autoclaves.
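The percentage reductions quoted above follow directly from the paired figures. A short check, using only the numbers given in the abstract (the helper name is hypothetical):

```python
def percent_reduction(new: float, old: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return 100.0 * (1.0 - new / old)


# (PCS value, steam autoclave value) as quoted in the abstract
comparisons = {
    "electricity (kW)": (0, 9),
    "water (oz)": (10, 640),
    "weight (lbs)": (20, 450),
    "cube (cubic ft)": (2.1, 60.2),
}

if __name__ == "__main__":
    for name, (pcs, steam) in comparisons.items():
        print(f"{name}: {percent_reduction(pcs, steam):.1f}% less")
```

The weight and cube figures come out slightly above the quoted 95% and 96%, so the abstract appears to round down.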
Bioengineering, Issue 88, chlorine dioxide, novel technologies, D-FENS, PCS, and D-FEND ALL, sterilization, decontamination, fresh produce safety
Characterization Of Multi-layered Fish Scales (Atractosteus spatula) Using Nanoindentation, X-ray CT, FTIR, and SEM
Institutions: U.S. Army Engineer Research and Development Center, University of Alabama, U.S. Army Engineer Research and Development Center.
The hierarchical architecture of protective biological materials such as mineralized fish scales, gastropod shells, ram’s horn, antlers, and turtle shells provides unique design principles with potentials for guiding the design of protective materials and systems in the future. Understanding the structure-property relationships for these material systems at the microscale and nanoscale where failure initiates is essential. Currently, experimental techniques such as nanoindentation, X-ray CT, and SEM provide researchers with a way to correlate the mechanical behavior with hierarchical microstructures of these material systems1-6. However, a well-defined standard procedure for specimen preparation of mineralized biomaterials is not currently available. In this study, the methods for probing spatially correlated chemical, structural, and mechanical properties of the multilayered scale of A. spatula using nanoindentation, FTIR, SEM, with energy-dispersive X-ray (EDX) microanalysis, and X-ray CT are presented.
Bioengineering, Issue 89, Atractosteus spatula, structure-property relation, nanoindentation, scan electron microscopy, X-ray computed tomography, Fourier transform infrared (FTIR) spectroscopy
Examining the Role of Nasopharyngeal-associated Lymphoreticular Tissue (NALT) in Mouse Responses to Vaccines
Institutions: U.S. Army Medical Research Institute of Infectious Diseases.
The nasopharyngeal-associated lymphoreticular tissues (NALT) found in humans, rodents, and other mammals, contribute to immunity in the nasal sinuses1-3. The NALT are two parallel bell-shaped structures located in the nasal passages above the hard palate, and are usually considered to be secondary components of the mucosal-associated lymphoid system4-6. Located within the NALT are discrete compartments of B and T lymphocytes interspersed with antigen-presenting dendritic cells4,7,8. These cells are surrounded by an epithelial cell layer intercalated with M-cells that are responsible for antigen retrieval from the mucosal surfaces of the air passages9,10. Naive lymphocytes circulating through the NALT are poised to respond to first encounters with respiratory pathogens7. While NALT disappear in humans by the age of two years, the Waldeyer's Ring and similarly structured lymphatic organs continue to persist throughout life6. In contrast to humans, mice retain NALT throughout life, thus providing a convenient animal model for the study of immune responses originating within the nasal sinuses11.
Cultures of single-cell suspensions of NALT are not practical due to low yields of mononuclear cells. However, NALT biology can be examined by ex vivo culturing of the intact organ, and this method has the additional advantage of maintaining the natural tissue structure. For in vivo studies, genetic knockout models presenting defects limited to NALT are not currently available due to a poor understanding of the developmental pathway. For example, while lymphotoxin-α knockout mice have atrophied NALT, the Peyer's patches, peripheral lymph nodes, follicular dendritic cells and other lymphoid tissues are also altered in these genetically manipulated mice12,13. As an alternative to gene knockout mice, surgical ablation permanently eliminates NALT from the nasal passage without affecting other tissues. The resulting mouse model has been used to establish relationships between NALT and immune responses to vaccines1,3. Serial collection of serum, saliva, nasal washes and vaginal secretions is necessary for establishing the basis of host responses to vaccination, while immune responses originating directly from NALT can be confirmed by tissue culture. The following procedures outline the surgeries, tissue culture and sample collection necessary to examine local and systemic humoral immune responses to intranasal (IN) vaccination.
Infectious Diseases, Issue 66, Immunology, nasal vaccination, nasopharyngeal-associated lymphoreticular tissue, mouse, antibody, mucosal immunity, NALT ablation, NALT culture, NALT-deficient mice
Reconstitution of a Kv Channel into Lipid Membranes for Structural and Functional Studies
Institutions: University of Texas Southwestern Medical Center at Dallas.
To study the lipid-protein interaction in a reductionistic fashion, it is necessary to incorporate the membrane proteins into membranes of well-defined lipid composition. We are studying the lipid-dependent gating effects in a prototype voltage-gated potassium (Kv) channel, and have worked out detailed procedures to reconstitute the channels into different membrane systems. Our reconstitution procedures take into consideration both detergent-induced fusion of vesicles and the fusion of protein/detergent micelles with the lipid/detergent mixed micelles, as well as the importance of reaching an equilibrium distribution of lipids among the protein/detergent/lipid and the detergent/lipid mixed micelles. Our data suggested that the insertion of the channels in the lipid vesicles is relatively random in orientation, and the reconstitution efficiency is so high that no detectable protein aggregates were seen in fractionation experiments. We have utilized the reconstituted channels to determine the conformational states of the channels in different lipids, record electrical activities of a small number of channels incorporated in planar lipid bilayers, screen for conformation-specific ligands from a phage-displayed peptide library, and support the growth of 2D crystals of the channels in membranes. The reconstitution procedures described here may be adapted for studying other membrane proteins in lipid bilayers, especially for the investigation of the lipid effects on the eukaryotic voltage-gated ion channels.
Molecular Biology, Issue 77, Biochemistry, Genetics, Cellular Biology, Structural Biology, Biophysics, Membrane Lipids, Phospholipids, Carrier Proteins, Membrane Proteins, Micelles, Molecular Motor Proteins, life sciences, biochemistry, Amino Acids, Peptides, and Proteins, lipid-protein interaction, channel reconstitution, lipid-dependent gating, voltage-gated ion channel, conformation-specific ligands, lipids
Improved Protocol For Laser Microdissection Of Human Pancreatic Islets From Surgical Specimens
Institutions: Paul Langerhans Institute Dresden, University of Technology Dresden, Metabolic Unit University of Pisa, Lilly Corporate Center, Faculty of Medicine Imperial College London, SIB Swiss Institute of Bioinformatics, Hannover Medical School, University of Geneva, University of Technology Dresden, Sanofi-Aventis.
Laser microdissection (LMD) is a technique that allows the recovery of selected cells and tissues from minute amounts of parenchyma 1,2. The dissected cells can be used for a variety of investigations, such as transcriptomic or proteomic studies, DNA assessment or chromosomal analysis 2,3. An especially challenging application of LMD is transcriptome analysis because of the lability of RNA 4, a problem that can be particularly prominent when cells are dissected from tissues that are rich in RNases, such as the pancreas. A microdissection protocol that enables fast identification and collection of target cells is essential in this setting in order to shorten the tissue handling time and, consequently, to ensure RNA preservation.
Here we describe a protocol for acquiring human pancreatic beta cells from surgical specimens to be used for transcriptomic studies 5. Small pieces of pancreas of about 0.5-1 cm³ were cut from the healthy appearing margins of resected pancreas specimens, embedded in Tissue-Tek O.C.T. Compound, immediately frozen in chilled 2-Methylbutane, and stored at -80 °C until sectioning. Forty serial sections of 10 μm thickness were cut on a cryostat under a -20 °C setting, transferred individually to glass slides, dried inside the cryostat for 1-2 min, and stored at -80 °C.
Immediately before the laser microdissection procedure, sections were fixed in ice cold, freshly prepared 70% ethanol for 30 sec, washed by 5-6 dips in ice cold DEPC-treated water, and dehydrated by two one-minute incubations in ice cold 100% ethanol followed by xylene (which is used for tissue dehydration) for 4 min; tissue sections were then air-dried for 3-5 min. Importantly, all steps, except the incubation in xylene, were performed using ice-cold reagents - a modification over a previously described protocol 6. Utilization of ice cold reagents resulted in a pronounced increase of the intrinsic autofluorescence of beta cells, and facilitated their recognition. For microdissection, four sections were dehydrated each time: two were placed into a foil-wrapped 50 ml tube, to protect the tissue from moisture and bleaching; the remaining two were immediately microdissected. This procedure was performed using a PALM MicroBeam instrument (Zeiss) employing the Auto Laser Pressure Catapulting (AutoLPC) mode. The completion of beta cell/islet dissection from four cryosections required no longer than 40-60 min. Cells were collected into one AdhesiveCap and lysed with 10 μl lysis buffer. Each single RNA specimen for transcriptomic analysis was obtained by combining 10 cell microdissected samples, followed by RNA extraction using the Pico Pure RNA Isolation Kit (Arcturus). This protocol improves the intrinsic autofluorescence of human beta cells, thus facilitating their rapid and accurate recognition and collection. Further improvement of this procedure could enable the dissection of phenotypically different beta cells, with possible implications for better understanding the changes associated with type 2 diabetes.
Medicine, Issue 71, Physiology, Anatomy, Biochemistry, Cellular Biology, Molecular Biology, Immunology, Surgery, Diabetes Mellitus, Type 2, laser microdissection, dissection, human beta cells, intrinsic autofluorescence, pancreas, partial resection, Diabetes type 2, transcriptomic studies, RNA analysis, islet
Engineering a Bilayered Hydrogel to Control ASC Differentiation
Institutions: United States Army Institute of Surgical Research, The University of Texas at Austin.
Natural polymers have gained importance over the years because of their host biocompatibility and ability to interact with cells in vitro and in vivo. An area of research that holds promise in regenerative medicine is the combinatorial use of novel biomaterials and stem cells. A fundamental strategy in the field of tissue engineering is the use of three-dimensional scaffolds (e.g., decellularized extracellular matrix, hydrogels, micro/nano particles) for directing cell function. This technology has evolved from the discovery that cells need a substrate upon which they can adhere, proliferate, and express their differentiated cellular phenotype and function 2-3. More recently, it has also been determined that cells not only use these substrates for adherence, but also interact and take cues from the matrix substrate (e.g., extracellular matrix, ECM)4. Therefore, the cells and scaffolds have a reciprocal connection that serves to control tissue development, organization, and ultimate function. Adipose-derived stem cells (ASCs) are mesenchymal, non-hematopoietic stem cells present in adipose tissue that can exhibit multi-lineage differentiation and serve as a readily available source of cells (i.e. pre-vascular endothelia and pericytes). Our hypothesis is that adipose-derived stem cells can be directed toward differing phenotypes simultaneously by simply co-culturing them in bilayered matrices1. Our laboratory is focused on dermal wound healing. To this end, we created a single composite matrix from the natural biomaterials fibrin, collagen, and chitosan that can mimic the characteristics and functions of a dermal-specific wound healing ECM environment.
Bioengineering, Issue 63, Biomedical Engineering, Tissue Engineering, chitosan, microspheres, collagen, hydrogel, PEG fibrin, cell delivery, adipose-derived stem cells, ASC, CSM
Oscillation and Reaction Board Techniques for Estimating Inertial Properties of a Below-knee Prosthesis
Institutions: University of Northern Colorado, Arizona State University, Iowa State University.
The purpose of this study was two-fold: 1) demonstrate a technique that can be used to directly estimate the inertial properties of a below-knee prosthesis, and 2) contrast the effects of the proposed technique and that of using intact limb inertial properties on joint kinetic estimates during walking in unilateral, transtibial amputees. An oscillation and reaction board system was validated and shown to be reliable when measuring inertial properties of known geometrical solids. When direct measurements of inertial properties of the prosthesis were used in inverse dynamics modeling of the lower extremity compared with inertial estimates based on an intact shank and foot, joint kinetics at the hip and knee were significantly lower during the swing phase of walking. Differences in joint kinetics during stance, however, were smaller than those observed during swing. Therefore, researchers focusing on the swing phase of walking should consider the impact of prosthesis inertia property estimates on study outcomes. For stance, either one of the two inertial models investigated in our study would likely lead to similar outcomes with an inverse dynamics assessment.
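The physics behind an oscillation and reaction board measurement can be sketched with textbook formulas. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not describe their apparatus, the function names are hypothetical, the oscillation is treated as a small-amplitude physical pendulum, and the reaction board is treated as a rigid board on two supports with a scale under one of them.

```python
import math

G = 9.81  # standard gravity in m/s^2 (assumed)


def inertia_about_pivot(mass_kg, pivot_to_com_m, period_s, g=G):
    """Moment of inertia about the pivot axis of a physical pendulum,
    from its small-amplitude period: I = m * g * d * T^2 / (4 * pi^2)."""
    return mass_kg * g * pivot_to_com_m * period_s ** 2 / (4 * math.pi ** 2)


def inertia_about_com(mass_kg, pivot_to_com_m, period_s, g=G):
    """Shift the pivot-axis value to the centre of mass with the
    parallel-axis theorem: I_cm = I_pivot - m * d^2."""
    i_pivot = inertia_about_pivot(mass_kg, pivot_to_com_m, period_s, g)
    return i_pivot - mass_kg * pivot_to_com_m ** 2


def com_from_reaction_board(scale_loaded_n, scale_empty_n,
                            support_span_m, object_weight_n):
    """Distance of the object's centre of mass from the pivot support,
    from a moment balance about that support."""
    return (scale_loaded_n - scale_empty_n) * support_span_m / object_weight_n


if __name__ == "__main__":
    # Sanity check with a uniform rod swinging about one end.
    m, length = 1.2, 0.5
    period = 2 * math.pi * math.sqrt(2 * length / (3 * G))
    print(inertia_about_com(m, length / 2, period), m * length ** 2 / 12)
```

The reaction board supplies the centre-of-mass distance that the oscillation formula needs, which is why the two measurements are paired.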
Bioengineering, Issue 87, prosthesis inertia, amputee locomotion, below-knee prosthesis, transtibial amputee
Constructing a Collagen Hydrogel for the Delivery of Stem Cell-loaded Chitosan Microspheres
Institutions: United States Army Institute of Surgical Research.
Multipotent stem cells have been shown to be extremely useful in the field of regenerative medicine1-3
. However, in order to use these cells effectively for tissue regeneration, a number of variables must be taken into account. These variables include: the total volume and surface area of the implantation site, the mechanical properties of the tissue and the tissue microenvironment, which includes the amount of vascularization and the components of the extracellular matrix. Therefore, the materials being used to deliver these cells must be biocompatible with a defined chemical composition while maintaining a mechanical strength that mimics the host tissue. These materials must also be permeable to oxygen and nutrients to provide a favorable microenvironment for cells to attach and proliferate. Chitosan, a cationic polysaccharide with excellent biocompatibility, can be easily chemically modified and has a high affinity to bind with in vivo. Chitosan mimics the glycosaminoglycan portion of the extracellular matrix, enabling it to function as a substrate for cell adhesion, migration and proliferation. In this study we utilize chitosan in the form of microspheres to deliver adipose-derived stem cells (ASC) into a collagen based three-dimensional scaffold6. An ideal cell-to-microsphere ratio was determined with respect to incubation time and cell density to achieve maximum number of cells that could be loaded. Once ASC are seeded onto the chitosan microspheres (CSM), they are embedded in a collagen scaffold and can be maintained in culture for extended periods. In summary, this study provides a method to precisely deliver stem cells within a three dimensional biomaterial scaffold.
Bioengineering, Issue 64, Biomedical Engineering, Tissue Engineering, chitosan, microspheres, collagen, hydrogel, cell delivery, adipose-derived stem cells, ASC, CSM
Introducing an Angle Adjustable Cutting Box for Analyzing Slice Shear Force in Meat
Institutions: Agriculture and Agri-Food Canada, Universidad de Córdoba, University of Nebraska.
Research indicates the fibre angle of the longissimus muscle can vary, depending upon location within a steak and throughout the muscle. Instead of using the original fixed 45° or 90° cutting angle for testing shear force, a variable angle cutting box can be adjusted so the angles of the knives correspond to the fibre angle of each sample. Within 2 min after cooking to an internal temperature of 71 °C on an open-hearth grill set at 210 °C, a 1 cm by 5 cm core is cut from the steak, parallel to muscle fibre direction, using 2 knife blades set 1 cm apart. This warm core is then subjected to the Slice Shear Force protocol (SSF) to evaluate meat texture. The use of the variable angle cutting box and the SSF protocol provides an accurate representation of the maximal shear force, as the slice and muscle fibres are consistently parallel. Therefore, the variable angle cutting box, in conjunction with the SSF protocol, can be used as a high-throughput technique to accurately evaluate meat tenderness in different locations of the longissimus muscle and, potentially, in other muscles.
Biophysics, Issue 74, Anatomy, Physiology, Physics, Agricultural Sciences, Meat, beef, shear force, tenderness, Warner-Bratzler, muscle angle, fibre, fiber, tissue, animal science
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-generation sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5 to more rigorous statistical models, e.g. Hidden Markov Models (HMMs)6-8. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9, which utilizes only explicit formulas, an innovation crucial to its performance advantages. More sophisticated than heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity, evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11 and epigenetic data12 to illustrate its usefulness.
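What a change point in read density means can be shown with a deliberately simple stand-in. The sketch below is not the BCP algorithm and uses no Bayesian or HMM machinery: it finds the single split of a binned read-count profile that minimizes the within-segment squared error, an assumption made purely for illustration. The function name is hypothetical.

```python
def best_single_change_point(counts):
    """Return the index k (1 <= k < n) that splits `counts` into
    counts[:k] and counts[k:] with the smallest total within-segment
    sum of squared deviations from each segment's mean.

    Illustrative least-squares split only; the BCP method in the
    abstract uses a Bayesian model with an unbounded number of states."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two bins")
    total = float(sum(counts))
    total_sq = float(sum(c * c for c in counts))
    left = left_sq = 0.0
    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        left += counts[k - 1]
        left_sq += counts[k - 1] ** 2
        right = total - left
        right_sq = total_sq - left_sq
        # Sum of squared deviations = sum(x^2) - (sum x)^2 / n per segment
        cost = (left_sq - left * left / k) + (right_sq - right * right / (n - k))
        if cost < best_cost:
            best_cost, best_k = cost, k
    return best_k


if __name__ == "__main__":
    profile = [1, 2, 1, 2, 1, 20, 21, 19, 20]  # background, then enrichment
    print(best_single_change_point(profile))
```

Applying such a split recursively yields a crude segmentation; the appeal of the approach described in the abstract is that the number of segments and their levels do not have to be fixed in advance.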
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1 and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8.
The ITS2 Database9 presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12 (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence whose secondary structure could not be folded correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14 search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16 for multiple sequence-structure alignment calculation and Neighbor Joining18 tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. Our vision for this article is that it can serve as a practical training tool for researchers embarking on phylogenetic studies and as an educational resource that can be incorporated into a classroom or teaching lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSC self-renewal and differentiation is important for the application of HSCs in research and clinical uses. However, it is not possible to obtain large quantities of HSCs due to their inability to proliferate in vitro. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro and in vivo.
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells
Institutions: Children's Mercy Hospital and Clinics, School of Medicine, University of Missouri-Kansas City.
The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g., drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism1,2. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence3.
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recent published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin,"4 in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.
Genetics, Issue 72, Molecular Biology, Immunology, Medicine, Genomics, Proteins, RNA-seq, Next Generation DNA Sequencing, Transcriptome, Transcription, Thrombin, Endothelial cells, high-throughput, DNA, genomic DNA, RT-PCR, PCR
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules, which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score that estimates how accurate the predictions are in the absence of experimental data. To accommodate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as templates, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best program for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Designing a Bio-responsive Robot from DNA Origami
Institutions: Bar-Ilan University.
Nucleic acids are astonishingly versatile. In addition to their natural role as storage medium for biological information1, they can be utilized in parallel computing2,3, recognize and bind molecular or cellular targets4,5, catalyze chemical reactions6,7, and generate calculated responses in a biological system8,9. Importantly, nucleic acids can be programmed to self-assemble into 2D and 3D structures10-12, enabling the integration of all these remarkable features in a single robot linking the sensing of biological cues to a preset response in order to exert a desired effect.
Creating shapes from nucleic acids was first proposed by Seeman13, and several variations on this theme have since been realized using various techniques11,12,14,15. However, the most significant is perhaps the one proposed by Rothemund, termed scaffolded DNA origami16. In this technique, the folding of a long (>7,000 bases) single-stranded DNA 'scaffold' is directed to a desired shape by hundreds of short complementary strands termed 'staples'. Folding is carried out by a temperature annealing ramp. This technique was successfully demonstrated in the creation of a diverse array of 2D shapes with remarkable precision and robustness. DNA origami was later extended to 3D as well17,18.
The current paper will focus on the caDNAno 2.0 software19 developed by Douglas and colleagues. caDNAno is a robust, user-friendly CAD tool enabling the design of 2D and 3D DNA origami shapes with versatile features. The design process relies on a systematic and accurate abstraction scheme for DNA structures, making it relatively straightforward and efficient.
In this paper we demonstrate the design of a DNA origami nanorobot that has been recently described20. This robot is 'robotic' in the sense that it links sensing to actuation, in order to perform a task. We explain how various sensing schemes can be integrated into the structure, and how this can be relayed to a desired effect. Finally we use Cando21 to simulate the mechanical properties of the designed shape. The concept we discuss can be adapted to multiple tasks and settings.
Bioengineering, Issue 77, Genetics, Biomedical Engineering, Molecular Biology, Medicine, Genomics, Nanotechnology, Nanomedicine, DNA origami, nanorobot, caDNAno, DNA, DNA Origami, nucleic acids, DNA structures, CAD, sequencing
Automated, Quantitative Cognitive/Behavioral Screening of Mice: For Genetics, Pharmacology, Animal Cognition and Undergraduate Instruction
Institutions: Rutgers University, Koç University, New York University, Fairfield University.
We describe a high-throughput, high-volume, fully automated, live-in 24/7 behavioral testing system for assessing the effects of genetic and pharmacological manipulations on basic mechanisms of cognition and learning in mice. A standard polypropylene mouse housing tub is connected through an acrylic tube to a standard commercial mouse test box. The test box has 3 hoppers, 2 of which are connected to pellet feeders. All are internally illuminable with an LED and monitored for head entries by infrared (IR) beams. Mice live in the environment, which eliminates handling during screening. They obtain their food during two or more daily feeding periods by performing in operant (instrumental) and Pavlovian (classical) protocols, for which we have written protocol-control software and quasi-real-time data analysis and graphing software. The data analysis and graphing routines are written in a MATLAB-based language created to simplify greatly the analysis of large time-stamped behavioral and physiological event records and to preserve a full data trail from raw data through all intermediate analyses to the published graphs and statistics within a single data structure. The data-analysis code harvests the data several times a day and subjects it to statistical and graphical analyses, which are automatically stored in the "cloud" and on in-lab computers. Thus, the progress of individual mice is visualized and quantified daily. The data-analysis code talks to the protocol-control code, permitting the automated advance from protocol to protocol of individual subjects. The behavioral protocols implemented are matching, autoshaping, timed hopper-switching, risk assessment in timed hopper-switching, impulsivity measurement, and the circadian anticipation of food availability. Open-source protocol-control and data-analysis code makes the addition of new protocols simple. 
Eight test environments fit in a 48 in x 24 in x 78 in cabinet; two such cabinets (16 environments) may be controlled by one computer.
Behavior, Issue 84, genetics, cognitive mechanisms, behavioral screening, learning, memory, timing
An Experimental and Bioinformatics Protocol for RNA-seq Analyses of Photoperiodic Diapause in the Asian Tiger Mosquito, Aedes albopictus
Institutions: Georgetown University, The Ohio State University.
Photoperiodic diapause is an important adaptation that allows individuals to escape harsh seasonal environments via a series of physiological changes, most notably developmental arrest and reduced metabolism. Global gene expression profiling via RNA-Seq can provide important insights into the transcriptional mechanisms of photoperiodic diapause. The Asian tiger mosquito, Aedes albopictus, is an outstanding organism for studying the transcriptional bases of diapause due to its ease of rearing, easily induced diapause, and the genomic resources available. This manuscript presents a general experimental workflow for identifying diapause-induced transcriptional differences in A. albopictus.
Rearing techniques, conditions necessary to induce diapause and non-diapause development, methods to estimate percent diapause in a population, and RNA extraction and integrity assessment for mosquitoes are documented. A workflow to process RNA-Seq data from Illumina sequencers culminates in a list of differentially expressed genes. The representative results demonstrate that this protocol can be used to effectively identify genes differentially regulated at the transcriptional level in A. albopictus due to photoperiodic differences. With modest adjustments, this workflow can be readily adapted to study the transcriptional bases of diapause or other important life history traits in other mosquitoes.
Genetics, Issue 93, Aedes albopictus Asian tiger mosquito, photoperiodic diapause, RNA-Seq de novo transcriptome assembly, mosquito husbandry
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g., signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
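The three motif types named above differ only in how much ambiguity each position allows, which a short sketch can make concrete. The Python below is an illustration, not SCOPE's code: it does not implement BEAM, PRISM, or SPACER, and it performs no over-representation scoring. It only shows what it means for a sequence to contain a non-degenerate, degenerate, or bipartite motif, by expanding standard IUPAC nucleotide codes into a regular expression. The function name `find_motif` and the demo sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_motif(motif, sequence):
    """Return 0-based start positions of every match, overlaps included.

    Hypothetical helper for illustration; forward strand only.
    """
    body = "".join(IUPAC[ch] for ch in motif.upper())
    # A lookahead lets the scan report overlapping occurrences.
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Made-up demo sequence containing one instance of each motif type.
demo_sequence = (
    "TT" + "ACCGGT" + "AA" + "AGCGTT" + "T"
    + "ACC" + "A" * 8 + "GGT" + "CC"
)

print(find_motif("ACCGGT", demo_sequence))          # non-degenerate
print(find_motif("ASCGWT", demo_sequence))          # degenerate (S = C/G, W = A/T)
print(find_motif("ACCnnnnnnnnGGT", demo_sequence))  # bipartite with 8-base spacer
```

A real motif finder would also scan the reverse strand and score candidate motifs against a background model; both are omitted here for brevity.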
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif
Vibrio cholerae: Model Organism to Study Bacterial Pathogenesis - Interview
Institutions: University of California Santa Cruz - UCSC.
Microbiology, Issue 4, microbial community, Vibrio cholerae, genome
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. These subsets served as intermediates, and recombination activity was shown with them. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic, and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase, Tre, had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1-infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
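The first step described above, finding sequences that resemble loxP, can be sketched as a sliding-window comparison. The code below is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say how similarity was measured, so plain mismatch counting (Hamming distance) against the canonical 34-bp loxP sequence is assumed here, and the function names and `max_mismatches` parameter are hypothetical.

```python
# Canonical loxP: two 13-bp inverted repeats flanking an 8-bp spacer.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def loxp_like_sites(sequence, max_mismatches):
    """Return (start, mismatches) for each 34-bp window close to loxP.

    Assumption: similarity is scored by mismatch count only; the
    evolved-recombinase work may have used different criteria.
    """
    sequence = sequence.upper()
    width = len(LOXP)
    hits = []
    for start in range(len(sequence) - width + 1):
        distance = hamming(sequence[start:start + width], LOXP)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits

# Made-up example: a loxP variant with two substitutions, padded by flanks.
variant = "C" + LOXP[1:33] + "G"
print(loxp_like_sites("GGG" + variant + "CCC", max_mismatches=3))
```

In practice a search like this would be run on both strands of the LTR sequence, and candidate sites would still need to be tested experimentally for recombination activity, as the abstract describes.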
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution
Interview: HIV-1 Proviral DNA Excision Using an Evolved Recombinase
Institutions: Heinrich-Pette-Institute for Experimental Virology and Immunology, University of Hamburg.
HIV-1 integrates into the host chromosome of infected cells and persists as a provirus flanked by long terminal repeats. Current treatment strategies primarily target virus enzymes or virus-cell fusion, suppressing the viral life cycle without eradicating the infection. Since the integrated provirus is not targeted by these approaches, new resistant strains of HIV-1 may emerge. Here, we report that the engineered recombinase Tre (see "Molecular Evolution of the Tre Recombinase", Buchholz, F., Max Planck Institute for Cell Biology and Genetics, Dresden) efficiently excises integrated HIV-1 proviral DNA from the genome of infected cells. We produced loxLTR-containing viral pseudotypes and infected HeLa cells to examine whether Tre recombinase can excise the provirus from the genome of HIV-1-infected human cells. A virus particle-releasing cell line was cloned and transfected with a plasmid expressing Tre or with a parental control vector. Recombinase activity and virus production were monitored. All assays demonstrated the efficient deletion of the provirus from infected cells without visible cytotoxic effects. These results serve as proof of principle that it is possible to evolve a recombinase to specifically target an HIV-1 LTR and that this recombinase is capable of excising the HIV-1 provirus from the genome of HIV-1-infected human cells.
Before an engineered recombinase could enter the therapeutic arena, however, significant obstacles need to be overcome. Among the most critical issues we face are efficient and safe delivery to targeted cells and the absence of side effects.
Medicine, Issue 16, HIV, Cell Biology, Recombinase, provirus, HeLa Cells
Microbial Communities in Nature and Laboratory - Interview
Institutions: MIT - Massachusetts Institute of Technology.
Microbiology, Issue 4, microbial community, biofilm, genome